Search results for: data warehousing

7355 A Soft Systems Methodology Perspective on Data Warehousing Education Improvement

Abstract:

This paper demonstrates how the soft systems methodology can be used to improve the delivery of a module in data warehousing for fourth year information technology students. Graduates in information technology needs to have academic skills but also needs to have good practical skills to meet the skills requirements of the information technology industry. In developing and improving current data warehousing education modules one has to find a balance in meeting the expectations of various role players such as the students themselves, industry and academia. The soft systems methodology, developed by Peter Checkland, provides a methodology for facilitating problem understanding from different world views. In this paper it is demonstrated how the soft systems methodology can be used to plan the improvement of data warehousing education for fourth year information technology students.

Keywords: Data warehousing, education, soft systems methodology, stakeholders, systems thinking.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1649

7354 Improved Data Warehousing: Lessons Learnt from the Systems Approach

Authors: Roelien Goede

Abstract:

Data warehousing success is not high enough. User dissatisfaction and failure to adhere to time frames and budgets are too common. Most traditional information systems practices are rooted in hard systems thinking. Today, the great systems thinkers are forgotten by information systems developers. A data warehouse is still a system and it is worth investigating whether systems thinkers such as Churchman can enhance our practices today. This paper investigates data warehouse development practices from a systems thinking perspective. An empirical investigation is done in order to understand the everyday practices of data warehousing professionals from a systems perspective. The paper presents a model for the application of Churchman-s systems approach in data warehouse development.

Keywords: Data warehouse development, Information systemsdevelopment, Interpretive case study, Systems thinking

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1533

7353 Generic Data Warehousing for Consumer Electronics Retail Industry

Authors: S. Habte, K. Ouazzane, P. Patel, S. Patel

Abstract:

The dynamic and highly competitive nature of the consumer electronics retail industry means that businesses in this industry are experiencing different decision making challenges in relation to pricing, inventory control, consumer satisfaction and product offerings. To overcome the challenges facing retailers and create opportunities, we propose a generic data warehousing solution which can be applied to a wide range of consumer electronics retailers with a minimum configuration. The solution includes a dimensional data model, a template SQL script, a high level architectural descriptions, ETL tool developed using C#, a set of APIs, and data access tools. It has been successfully applied by ASK Outlets Ltd UK resulting in improved productivity and enhanced sales growth.

Keywords: Consumer electronics retail, dimensional data model, data analysis, generic data warehousing, reporting.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1332

7352 Issues and Architecture for Supporting Data Warehouse Queries in Web Portals

Authors: Minsoo Lee, Yoon-kyung Lee, Hyejung Yoon, Soo-kyung Song, Sujeong Cheong

Abstract:

Data Warehousing tools have become very popular and currently many of them have moved to Web-based user interfaces to make it easier to access and use the tools. The next step is to enable these tools to be used within a portal framework. The portal framework consists of pages having several small windows that contain individual data warehouse query results. There are several issues that need to be considered when designing the architecture for a portal enabled data warehouse query tool. Some issues need special techniques that can overcome the limitations that are imposed by the nature of data warehouse queries. Issues such as single sign-on, query result caching and sharing, customization, scheduling and authorization need to be considered. This paper discusses such issues and suggests an architecture to support data warehouse queries within Web portal frameworks.

Keywords: Data Warehousing tools, data warehousing queries, web portal frameworks.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2076

7351 Big Brain: A Single Database System for a Federated Data Warehouse Architecture

Authors: X. Gumara Rigol, I. Martínez de Apellaniz Anzuola, A. Garcia Serrano, A. Franzi Cros, O. Vidal Calbet, A. Al Maruf

Abstract:

Traditional federated architectures for data warehousing work well when corporations have existing regional data warehouses and there is a need to aggregate data at a global level. Schibsted Media Group has been maturing from a decentralised organisation into a more globalised one and needed to build both some of the regional data warehouses for some brands at the same time as the global one. In this paper, we present the architectural alternatives studied and why a custom federated approach was the notable recommendation to go further with the implementation. Although the data warehouses are logically federated, the implementation uses a single database system which presented many advantages like: cost reduction and improved data access to global users allowing consumers of the data to have a common data model for detailed analysis across different geographies and a flexible layer for local specific needs in the same place.

Keywords: Data integration, data warehousing, federated architecture, online analytical processing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 654

7350 The Relevance of Data Warehousing and Data Mining in the Field of Evidence-based Medicine to Support Healthcare Decision Making

Authors: Nevena Stolba, A Min Tjoa

Abstract:

Evidence-based medicine is a new direction in modern healthcare. Its task is to prevent, diagnose and medicate diseases using medical evidence. Medical data about a large patient population is analyzed to perform healthcare management and medical research. In order to obtain the best evidence for a given disease, external clinical expertise as well as internal clinical experience must be available to the healthcare practitioners at right time and in the right manner. External evidence-based knowledge can not be applied directly to the patient without adjusting it to the patient-s health condition. We propose a data warehouse based approach as a suitable solution for the integration of external evidence-based data sources into the existing clinical information system and data mining techniques for finding appropriate therapy for a given patient and a given disease. Through integration of data warehousing, OLAP and data mining techniques in the healthcare area, an easy to use decision support platform, which supports decision making process of care givers and clinical managers, is built. We present three case studies, which show, that a clinical data warehouse that facilitates evidence-based medicine is a reliable, powerful and user-friendly platform for strategic decision making, which has a great relevance for the practice and acceptance of evidence-based medicine.

Keywords: data mining, data warehousing, decision-support systems, evidence-based medicine.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3733

7349 The Comparison of Anchor and Star Schema from a Query Performance Perspective

Authors: Radek Němec

Abstract:

Today's business environment requires that companies have access to highly relevant information in a matter of seconds. Modern Business Intelligence tools rely on data structured mostly in traditional dimensional database schemas, typically represented by star schemas. Dimensional modeling is already recognized as a leading industry standard in the field of data warehousing although several drawbacks and pitfalls were reported. This paper focuses on the analysis of another data warehouse modeling technique - the anchor modeling, and its characteristics in context with the standardized dimensional modeling technique from a query performance perspective. The results of the analysis show information about performance of queries executed on database schemas structured according to principles of each database modeling technique.

Keywords: Data warehousing, anchor modeling, star schema, anchor schema, query performance.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3270

7348 The Data Mining usage in Production System Management

Authors: Pavel Vazan, Pavol Tanuska, Michal Kebisek

Abstract:

The paper gives the pilot results of the project that is oriented on the use of data mining techniques and knowledge discoveries from production systems through them. They have been used in the management of these systems. The simulation models of manufacturing systems have been developed to obtain the necessary data about production. The authors have developed the way of storing data obtained from the simulation models in the data warehouse. Data mining model has been created by using specific methods and selected techniques for defined problems of production system management. The new knowledge has been applied to production management system. Gained knowledge has been tested on simulation models of the production system. An important benefit of the project has been proposal of the new methodology. This methodology is focused on data mining from the databases that store operational data about the production process.

Keywords: data mining, data warehousing, management of production system, simulation

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3439

7347 Towards Development of Solution for Business Process-Oriented Data Analysis

Authors: M. Klimavicius

Abstract:

This paper proposes a modeling methodology for the development of data analysis solution. The Author introduce the approach to address data warehousing issues at the at enterprise level. The methodology covers the process of the requirements eliciting and analysis stage as well as initial design of data warehouse. The paper reviews extended business process model, which satisfy the needs of data warehouse development. The Author considers that the use of business process models is necessary, as it reflects both enterprise information systems and business functions, which are important for data analysis. The Described approach divides development into three steps with different detailed elaboration of models. The Described approach gives possibility to gather requirements and display them to business users in easy manner.

Keywords: Data warehouse, data analysis, business processmanagement.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1352

7346 The Study of Cost Accounting in S Company Based On TDABC

Authors: Heng Ma

Abstract:

Third-party warehousing logistics has an important role in the development of external logistics. At present, the third-party logistics in our country is still a new industry, the accounting system has not yet been established, the current financial accounting system of third-party warehousing logistics is mainly in the traditional way of thinking, and only able to provide the total cost information of the entire enterprise during the accounting period, unable to reflect operating indirect cost information. In order to solve the problem of third-party logistics industry cost information distortion, improve the level of logistics cost management, the paper combines theoretical research and case analysis method to reflect cost allocation by building third-party logistics costing model using Time-Driven Activity-Based Costing(TDABC), and takes S company as an example to account and control the warehousing logistics cost.Based on the idea of “Products consume activities and activities consume resources”, TDABC put time into the main cost driver and use time-consuming equation resources assigned to cost objects. In S company, the objects focuses on three warehouse, engaged with warehousing and transportation (the second warehouse, transport point) service. These three warehouse respectively including five departments, Business Unit, Production Unit, Settlement Center, Security Department and Equipment Division, the activities in these departments are classified by in-out of storage forecast, in-out of storage or transit and safekeeping work. By computing capacity cost rate, building the time-consuming equation, the paper calculates the final operation cost so as to reveal the real cost.The numerical analysis results show that the TDABC can accurately reflect the cost allocation of service customers and reveal the spare capacity cost of resource center, verifies the feasibility and validity of TDABC in third-party logistics industry cost accounting. It inspires enterprises focus on customer relationship management and reduces idle cost to strengthen the cost management of third-party logistics enterprises.

Keywords: Third-party logistics enterprises, TDABC, cost management, S company.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2375

7345 Dimensional Modeling of HIV Data Using Open Source

Authors: Charles D. Otine, Samuel B. Kucel, Lena Trojer

Abstract:

Selecting the data modeling technique for an information system is determined by the objective of the resultant data model. Dimensional modeling is the preferred modeling technique for data destined for data warehouses and data mining, presenting data models that ease analysis and queries which are in contrast with entity relationship modeling. The establishment of data warehouses as components of information system landscapes in many organizations has subsequently led to the development of dimensional modeling. This has been significantly more developed and reported for the commercial database management systems as compared to the open sources thereby making it less affordable for those in resource constrained settings. This paper presents dimensional modeling of HIV patient information using open source modeling tools. It aims to take advantage of the fact that the most affected regions by the HIV virus are also heavily resource constrained (sub-Saharan Africa) whereas having large quantities of HIV data. Two HIV data source systems were studied to identify appropriate dimensions and facts these were then modeled using two open source dimensional modeling tools. Use of open source would reduce the software costs for dimensional modeling and in turn make data warehousing and data mining more feasible even for those in resource constrained settings but with data available.

Keywords: About Database, Data Mining, Data warehouse, Dimensional Modeling, Open Source.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1910

7344 XML Schema Automatic Matching Solution

Authors: Huynh Quyet Thang, Vo Sy Nam

Abstract:

Schema matching plays a key role in many different applications, such as schema integration, data integration, data warehousing, data transformation, E-commerce, peer-to-peer data management, ontology matching and integration, semantic Web, semantic query processing, etc. Manual matching is expensive and error-prone, so it is therefore important to develop techniques to automate the schema matching process. In this paper, we present a solution for XML schema automated matching problem which produces semantic mappings between corresponding schema elements of given source and target schemas. This solution contributed in solving more comprehensively and efficiently XML schema automated matching problem. Our solution based on combining linguistic similarity, data type compatibility and structural similarity of XML schema elements. After describing our solution, we present experimental results that demonstrate the effectiveness of this approach.

Keywords: XML Schema, Schema Matching, SemanticMatching, Automatic XML Schema Matching.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1783

7343 Multidimensional Performance Management

Authors: David Wiese

Abstract:

In order to maximize efficiency of an information management platform and to assist in decision making, the collection, storage and analysis of performance-relevant data has become of fundamental importance. This paper addresses the merits and drawbacks provided by the OLAP paradigm for efficiently navigating large volumes of performance measurement data hierarchically. The system managers or database administrators navigate through adequately (re)structured measurement data aiming to detect performance bottlenecks, identify causes for performance problems or assessing the impact of configuration changes on the system and its representative metrics. Of particular importance is finding the root cause of an imminent problem, threatening availability and performance of an information system. Leveraging OLAP techniques, in contrast to traditional static reporting, this is supposed to be accomplished within moderate amount of time and little processing complexity. It is shown how OLAP techniques can help improve understandability and manageability of measurement data and, hence, improve the whole Performance Analysis process.

Keywords: Data Warehousing, OLAP, Multidimensional Navigation, Performance Diagnosis, Performance Management, Performance Tuning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2085

7342 CSOLAP (Continuous Spatial On-Line Analytical Processing)

Authors: Taher Omran Ahmed, Abdullatif Mihdi Buras

Abstract:

Decision support systems are usually based on multidimensional structures which use the concept of hypercube. Dimensions are the axes on which facts are analyzed and form a space where a fact is located by a set of coordinates at the intersections of members of dimensions. Conventional multidimensional structures deal with discrete facts linked to discrete dimensions. However, when dealing with natural continuous phenomena the discrete representation is not adequate. There is a need to integrate spatiotemporal continuity within multidimensional structures to enable analysis and exploration of continuous field data. Research issues that lead to the integration of spatiotemporal continuity in multidimensional structures are numerous. In this paper, we discuss research issues related to the integration of continuity in multidimensional structures, present briefly a multidimensional model for continuous field data. We also define new aggregation operations. The model and the associated operations and measures are validated by a prototype.

Keywords: Continuous Data, Data warehousing, DecisionSupport, SOLAP

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1541

7341 Enhancing Warehousing Operations in Cold Supply Chain through the Use of IoT and LiFi Technologies

Authors: S. El-Gamal, P. Hossam, A. Abd El Aziz, R. Mahmoud, A. Hassan, D. Hilal, E. Ayman, H. Haytham, O. Khamis

Abstract:

Several concerns fall upon the supply chain especially in cold supply chains. These concerns are mainly in the distribution and storage phases. This research focuses on the storage area, which contains several activities such as the picking activity that faces a lot of obstacles and challenges. The implementation of IoT solutions enables businesses to monitor the temperature of food items, which is perhaps the most critical parameter in cold chains. Therefore, the research at hand proposes a practical solution that would help in eliminating the problems related to ineffective picking for products especially fish and seafood products by using IoT technology, most notably LiFi technology; thus, guaranteeing sufficient picking, reducing waste, and consequently lowering costs. A prototype was specially designed and examined. This research is a single case study research. Two methods of data collection were used; observation and semi-structured interviews. Semi-structured interviews were conducted with managers and a decision maker at one of the biggest retail stores Carrefour, Alexandria, Egypt to validate the problem and the proposed practical solution using IoT and LiFi technology. A total of three interviews were conducted. As a result, a SWOT analysis was achieved in order to highlight all the strengths and weaknesses of using the recommended LiFi solution in the picking process. According to the investigations, it was found that, the use of IoT and LiFi technology is cost effective, efficient, and reduces human errors, minimizes the percentage of product waste and thus saves money and cost. Therefore, increasing customer satisfaction and profits could be achieved.

Keywords: Cold supply chain, IoT, LiFi, warehousing operation, picking process.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 395

7340 Futuristic Black Box Design Considerations and Global Networking for Real Time Monitoring of Flight Performance Parameters

Authors: K. Parandhama Gowd

Abstract:

The aim of this research paper is to conceptualize, discuss, analyze and propose alternate design methodologies for futuristic Black Box for flight safety. The proposal also includes global networking concepts for real time surveillance and monitoring of flight performance parameters including GPS parameters. It is expected that this proposal will serve as a failsafe real time diagnostic tool for accident investigation and location of debris in real time. In this paper, an attempt is made to improve the existing methods of flight data recording techniques and improve upon design considerations for futuristic FDR to overcome the trauma of not able to locate the block box. Since modern day communications and information technologies with large bandwidth are available coupled with faster computer processing techniques, the attempt made in this paper to develop a failsafe recording technique is feasible. Further data fusion/data warehousing technologies are available for exploitation.

Keywords: Flight data recorder (FDR), black box, diagnostic tool, global networking, cockpit voice and data recorder (CVDR), air traffic control (ATC), air traffic, telemetry, tracking and control centers ATTTCC).

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1401

7339 Testing the Performance of Rival Warehousing Policies through Discrete Event Simulation

Authors: João Vilas-Boas, Abdul Suleman, Luis Moreira

Abstract:

This research tested the performance of alternative warehouse designs concerning the picking process. The chosen performance measures were Travel Distance and Total Fulfilment Time. An explanatory case study was built up around a model implemented with SIMUL8. Hypotheses were set by selecting outcomes from the literature survey matching popular empirical findings. 17.4% reductions were found for Total Fulfilment Time and Resource Utilisation. The latter was then used as a proxy for operational efficiency. Literal replication of theoretical data-patterns was considered as an internal validity sign. Assessing the estimated changes benefits ahead of implementation was found to be a contribution to practice.

Keywords: Warehouse discrete-event simulation, Storage policy selection and assessment, Performance evaluation of order picking.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2066

7338 Design of Distribution Network for Gas Cylinders in Jordan

Authors: Hazem J. Smadi

Abstract:

Performance of a supply chain is directly related to a distribution network that entails the location of storing materials or products and how products are delivered to the end customer through different stages in the supply chain. This study analyses the current distribution network used for delivering gas cylinders to end customer in Jordan. Evaluation of current distribution has been conducted across customer service components. A modification on the current distribution network in terms of central warehousing in each city in the country improves the response time and customer experience.

Keywords: Distribution network, gas cylinder, Jordan, supply chain.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1643

7337 Reliability of Intra-Logistics Systems – Simulating Performance Availability

Authors: Steffen Schieweck, Johannes Dregger, Sascha Kaczmarek, Michael ten Hompel

Abstract:

Logistics distributors face the issue of having to provide increasing service levels while being forced to reduce costs at the same time. Same-day delivery, quick order processing and rapidly growing ranges of articles are only some of the prevailing challenges. One key aspect of the performance of an intra-logistics system is how often and in which amplitude congestions and dysfunctions affect the processing operations. By gaining knowledge of the so called ‘performance availability’ of such a system during the planning stage, oversizing and wasting can be reduced whereas planning transparency is increased. State of the art for the determination of this KPI is simulation studies. However, their structure and therefore their results may vary unforeseeably. This article proposes a concept for the establishment of ‘certified’ and hence reliable and comparable simulation models.

Keywords: Intra-logistics, performance availability, simulation, warehousing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2362

7336 Application the Queuing Theory in the Warehouse Optimization

Authors: Jaroslav Masek, Juraj Camaj, Eva Nedeliakova

Abstract:

The aim of optimization of store management is not only designing the situation of store management itself including its equipment, technology and operation. In optimization of store management we need to consider also synchronizing of technological, transport, store and service operations throughout the whole process of logistic chain in such a way that a natural flow of material from provider to consumer will be achieved the shortest possible way, in the shortest possible time in requested quality and quantity and with minimum costs. The paper deals with the application of the queuing theory for optimization of warehouse processes. The first part refers to common information about the problematic of warehousing and using mathematical methods for logistics chains optimization. The second part refers to preparing a model of a warehouse within queuing theory. The conclusion of the paper includes two examples of using queuing theory in praxis.

Keywords: Queuing theory, logistics system, mathematical methods, warehouse optimization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 6481

7335 Fish Marketing: A Panacea towards Sustainable Agriculture in Ogun State, Nigeria

Authors: A. M. Omoare, E. O. Fakoya, B. G. Abiona, W. O. Oyediran

Abstract:

This study assessed fish marketing as panacea towards sustainable agriculture in Ogun State, Nigeria. Multi-stage sampling technique was used in the selection of 150 fish marketers for this study. Descriptive statistics were used for the objectives while Product Pearson Moment Correlation was used to test the hypothesis. Result of the findings revealed that the mean age of the respondents was 38.60 years. Majority (93.33%) of the respondents had acceptable levels of formal education. Many (44.00%) of the respondents had spent 1-5 years in fish marketing. The average quantity of fish sold in a day was 94.10kg. However, efficient fish marketing were hindered by inadequate processing equipment, storage rooms and ice holding facilities (86.67%). There was a significant relationship between socio-economic characteristics and profit realized from fish marketing (p < 0.05). It was recommended that storage and warehousing facilities should be provided to the fish marketers in the study area.

Keywords: Fish marketers, panacea, retail markets, sustainable.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2222

7334 A Fuzzy Implementation for Optimization of Storage Locations in an Industrial AS/RS

Authors: C. Senanayake, S. Veera Ragavan

Abstract:

Warehousing is commonly used in factories for the storage of products until delivery of orders. As the amount of products stored increases it becomes tedious to be carried out manually. In recent years, the manual storing has converted into fully or partially computer controlled systems, also known as Automated Storage and Retrieval Systems (AS/RS). This paper discusses an ASRS system, which was designed such that the best storage location for the products is determined by utilizing a fuzzy control system. The design maintains the records of the products to be/already in store and the storage/retrieval times along with the availability status of the storage locations. This paper discusses on the maintenance of the above mentioned records and the utilization of the concept of fuzzy logic in order to determine the optimum storage location for the products. The paper will further discuss on the dynamic splitting and merging of the storage locations depending on the product sizes.

Keywords: ASRS, fuzzy control systems, MySQL database, dynamic splitting and merging.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2090

7333 Renovation of Industrial Zones in Ho Chi Minh City: An Approach from Changing Function of Processing to Urban Warehousing

Authors: Thu Le Thi Bao

Abstract:

Industrial parks have both active roles in promoting economic development, and source of appearance of boarding houses and slums in the adjacent area, lacking infrastructure, causing many social matters. The context of recent pandemic and climate change on a global scale pose issues that need to be resolved for sustainable development. Ho Chi Minh city aims to develop housing for migrant workers to stabilize human resources and at the same time, solve problems of social evils caused by poor living conditions. The paper focuses on the content of renovating existing industrial parks and worker accommodation in Ho Chi Minh city to propose appropriate models, contributing to the goal of urban embellishment and solutions for industrial parks to adapt to decree 29/2008/ND-CP abnormal impact conditions such as pandemics, climate change, crises.

Keywords: Industrial park, social housing, accommodation, distribution centre.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 109

7332 Big Data: Big Challenges to Privacy and Data Protection

Authors: Abu Bakar Munir, Siti Hajar Mohd Yasin, Firdaus Muhammad-Sukki

Abstract:

This paper seeks to analyse the benefits of big data and more importantly the challenges it pose to the subject of privacy and data protection. First, the nature of big data will be briefly deliberated before presenting the potential of big data in the present days. Afterwards, the issue of privacy and data protection is highlighted before discussing the challenges of implementing this issue in big data. In conclusion, the paper will put forward the debate on the adequacy of the existing legal framework in protecting personal data in the era of big data.

Keywords: Big data, data protection, information, privacy.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3856

7331 Order Optimization of a Telecommunication Distribution Center through Service Lead Time

Authors: Tamás Hartványi, Ferenc Tóth

Abstract:

European telecommunication distribution center performance is measured by service lead time and quality. Operation model is CTO (customized to order) namely, a high mix customization of telecommunication network equipment and parts. CTO operation contains material receiving, warehousing, network and server assembly to order and configure based on customer specifications. Variety of the product and orders does not support mass production structure. One of the success factors to satisfy customer is to have a proper aggregated planning method for the operation in order to have optimized human resources and highly efficient asset utilization. Research will investigate several methods and find proper way to have an order book simulation where practical optimization problem may contain thousands of variables and the simulation running times of developed algorithms were taken into account with high importance. There are two operation research models that were developed, customer demand is given in orders, no change over time, customer demands are given for product types, and changeover time is constant.

Keywords: CTO, aggregated planning, demand simulation, changeover time.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 745

7330 Data Preprocessing for Supervised Leaning

Authors: S. B. Kotsiantis, D. Kanellopoulos, P. E. Pintelas

Abstract:

Many factors affect the success of Machine Learning (ML) on a given task. The representation and quality of the instance data is first and foremost. If there is much irrelevant and redundant information present or noisy and unreliable data, then knowledge discovery during the training phase is more difficult. It is well known that data preparation and filtering steps take considerable amount of processing time in ML problems. Data pre-processing includes data cleaning, normalization, transformation, feature extraction and selection, etc. The product of data pre-processing is the final training set. It would be nice if a single sequence of data pre-processing algorithms had the best performance for each data set but this is not happened. Thus, we present the most well know algorithms for each step of data pre-processing so that one achieves the best performance for their data set.

Keywords: Data mining, feature selection, data cleaning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 5925

7329 Applications of Big Data in Education

Authors: Faisal Kalota

Abstract:

Big Data and analytics have gained a huge momentum in recent years. Big Data feeds into the field of Learning Analytics (LA) that may allow academic institutions to better understand the learners’ needs and proactively address them. Hence, it is important to have an understanding of Big Data and its applications. The purpose of this descriptive paper is to provide an overview of Big Data, the technologies used in Big Data, and some of the applications of Big Data in education. Additionally, it discusses some of the concerns related to Big Data and current research trends. While Big Data can provide big benefits, it is important that institutions understand their own needs, infrastructure, resources, and limitation before jumping on the Big Data bandwagon.

Keywords: Analytics, Big Data in Education, Hadoop, Learning Analytics.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4815

7328 Research of Data Cleaning Methods Based on Dependency Rules

Authors: Yang Bao, Shi Wei Deng, Wang Qun Lin

Abstract:

This paper introduces the concept and principle of data cleaning, analyzes the types and causes of dirty data, and proposes several key steps of typical cleaning process, puts forward a well scalability and versatility data cleaning framework, in view of data with attribute dependency relation, designs several of violation data discovery algorithms by formal formula, which can obtain inconsistent data to all target columns with condition attribute dependent no matter data is structured (SQL) or unstructured (NoSql), and gives 6 data cleaning methods based on these algorithms.

Keywords: Data cleaning, dependency rules, violation data discovery, data repair.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2561

7327 Coalescing Data Marts

Authors: N. Parimala, P. Pahwa

Abstract:

OLAP uses multidimensional structures, to provide access to data for analysis. Traditionally, OLAP operations are more focused on retrieving data from a single data mart. An exception is the drill across operator. This, however, is restricted to retrieving facts on common dimensions of the multiple data marts. Our concern is to define further operations while retrieving data from multiple data marts. Towards this, we have defined six operations which coalesce data marts. While doing so we consider the common as well as the non-common dimensions of the data marts.

Keywords: Data warehouse, Dimension, OLAP, Star Schema.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1515

7326 Mining Big Data in Telecommunications Industry: Challenges, Techniques, and Revenue Opportunity

Authors: Hoda A. Abdel Hafez

Abstract:

Mining big data represents a big challenge nowadays. Many types of research are concerned with mining massive amounts of data and big data streams. Mining big data faces a lot of challenges including scalability, speed, heterogeneity, accuracy, provenance and privacy. In telecommunication industry, mining big data is like a mining for gold; it represents a big opportunity and maximizing the revenue streams in this industry. This paper discusses the characteristics of big data (volume, variety, velocity and veracity), data mining techniques and tools for handling very large data sets, mining big data in telecommunication and the benefits and opportunities gained from them.

Keywords: Mining Big Data, Big Data, Machine learning, Data Streams, Telecommunication.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2418