Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 19088

Search results for: data warehousing

19088 Generic Data Warehousing for Consumer Electronics Retail Industry

Authors: S. Habte, K. Ouazzane, P. Patel, S. Patel

Abstract:

The dynamic and highly competitive nature of the consumer electronics retail industry means that businesses in this industry are experiencing different decision making challenges in relation to pricing, inventory control, consumer satisfaction and product offerings. To overcome the challenges facing retailers and create opportunities, we propose a generic data warehousing solution which can be applied to a wide range of consumer electronics retailers with a minimum configuration. The solution includes a dimensional data model, a template SQL script, a high level architectural descriptions, ETL tool developed using C#, a set of APIs, and data access tools. It has been successfully applied by ASK Outlets Ltd UK resulting in improved productivity and enhanced sales growth.

Keywords: consumer electronics, data warehousing, dimensional data model, generic, retail industry

Procedia PDF Downloads 306
19087 Big Brain: A Single Database System for a Federated Data Warehouse Architecture

Authors: X. Gumara Rigol, I. Martínez de Apellaniz Anzuola, A. Garcia Serrano, A. Franzi Cros, O. Vidal Calbet, A. Al Maruf

Abstract:

Traditional federated architectures for data warehousing work well when corporations have existing regional data warehouses and there is a need to aggregate data at a global level. Schibsted Media Group has been maturing from a decentralised organisation into a more globalised one and needed to build both some of the regional data warehouses for some brands at the same time as the global one. In this paper, we present the architectural alternatives studied and why a custom federated approach was the notable recommendation to go further with the implementation. Although the data warehouses are logically federated, the implementation uses a single database system which presented many advantages like: cost reduction and improved data access to global users allowing consumers of the data to have a common data model for detailed analysis across different geographies and a flexible layer for local specific needs in the same place.

Keywords: data integration, data warehousing, federated architecture, Online Analytical Processing (OLAP)

Procedia PDF Downloads 154
19086 Privacy Preserving Data Publishing Based on Sensitivity in Context of Big Data Using Hive

Authors: P. Srinivasa Rao, K. Venkatesh Sharma, G. Sadhya Devi, V. Nagesh

Abstract:

Privacy Preserving Data Publication is the main concern in present days because the data being published through the internet has been increasing day by day. This huge amount of data was named as Big Data by its size. This project deals the privacy preservation in the context of Big Data using a data warehousing solution called hive. We implemented Nearest Similarity Based Clustering (NSB) with Bottom-up generalization to achieve (v,l)-anonymity. (v,l)-Anonymity deals with the sensitivity vulnerabilities and ensures the individual privacy. We also calculate the sensitivity levels by simple comparison method using the index values, by classifying the different levels of sensitivity. The experiments were carried out on the hive environment to verify the efficiency of algorithms with Big Data. This framework also supports the execution of existing algorithms without any changes. The model in the paper outperforms than existing models.

Keywords: sensitivity, sensitive level, clustering, Privacy Preserving Data Publication (PPDP), bottom-up generalization, Big Data

Procedia PDF Downloads 205
19085 Application of Knowledge Discovery in Database Techniques in Cost Overruns of Construction Projects

Authors: Mai Ghazal, Ahmed Hammad

Abstract:

Cost overruns in construction projects are considered as worldwide challenges since the cost performance is one of the main measures of success along with schedule performance. To overcome this problem, studies were conducted to investigate the cost overruns' factors, also projects' historical data were analyzed to extract new and useful knowledge from it. This research is studying and analyzing the effect of some factors causing cost overruns using the historical data from completed construction projects. Then, using these factors to estimate the probability of cost overrun occurrence and predict its percentage for future projects. First, an intensive literature review was done to study all the factors that cause cost overrun in construction projects, then another review was done for previous researcher papers about mining process in dealing with cost overruns. Second, a proposed data warehouse was structured which can be used by organizations to store their future data in a well-organized way so it can be easily analyzed later. Third twelve quantitative factors which their data are frequently available at construction projects were selected to be the analyzed factors and suggested predictors for the proposed model.

Keywords: construction management, construction projects, cost overrun, cost performance, data mining, data warehousing, knowledge discovery, knowledge management

Procedia PDF Downloads 238
19084 The Study of Cost Accounting in S Company Based on TDABC

Authors: Heng Ma

Abstract:

Third-party warehousing logistics has an important role in the development of external logistics. At present, the third-party logistics in our country is still a new industry, the accounting system has not yet been established, the current financial accounting system of third-party warehousing logistics is mainly in the traditional way of thinking, and only able to provide the total cost information of the entire enterprise during the accounting period, unable to reflect operating indirect cost information. In order to solve the problem of third-party logistics industry cost information distortion, improve the level of logistics cost management, the paper combines theoretical research and case analysis method to reflect cost allocation by building third-party logistics costing model using Time-Driven Activity-Based Costing(TDABC), and takes S company as an example to account and control the warehousing logistics cost. Based on the idea of “Products consume activities and activities consume resources”, TDABC put time into the main cost driver and use time-consuming equation resources assigned to cost objects. In S company, the objects focuses on three warehouse, engaged with warehousing and transportation (the second warehouse, transport point) service. These three warehouse respectively including five departments, Business Unit, Production Unit, Settlement Center, Security Department and Equipment Division, the activities in these departments are classified by in-out of storage forecast, in-out of storage or transit and safekeeping work. By computing capacity cost rate, building the time-consuming equation, the paper calculates the final operation cost so as to reveal the real cost. The numerical analysis results show that the TDABC can accurately reflect the cost allocation of service customers and reveal the spare capacity cost of resource center, verifies the feasibility and validity of TDABC in third-party logistics industry cost accounting. It inspires enterprises focus on customer relationship management and reduces idle cost to strengthen the cost management of third-party logistics enterprises.

Keywords: third-party logistics enterprises, TDABC, cost management, S company

Procedia PDF Downloads 291
19083 Futuristic Black Box Design Considerations and Global Networking for Real Time Monitoring of Flight Performance Parameters

Authors: K. Parandhama Gowd

Abstract:

The aim of this research paper is to conceptualize, discuss, analyze and propose alternate design methodologies for futuristic Black Box for flight safety. The proposal also includes global networking concepts for real time surveillance and monitoring of flight performance parameters including GPS parameters. It is expected that this proposal will serve as a failsafe real time diagnostic tool for accident investigation and location of debris in real time. In this paper, an attempt is made to improve the existing methods of flight data recording techniques and improve upon design considerations for futuristic FDR to overcome the trauma of not able to locate the block box. Since modern day communications and information technologies with large bandwidth are available coupled with faster computer processing techniques, the attempt made in this paper to develop a failsafe recording technique is feasible. Further data fusion/data warehousing technologies are available for exploitation.

Keywords: flight data recorder (FDR), black box, diagnostic tool, global networking, cockpit voice and data recorder (CVDR), air traffic control (ATC), air traffic, telemetry, tracking and control centers ATTTCC)

Procedia PDF Downloads 489
19082 Shortening Distances: The Link between Logistics and International Trade

Authors: Felipe Bedoya Maya, Agustina Calatayud, Vileydy Gonzalez Mejia

Abstract:

Encompassing inventory, warehousing, and transportation management, logistics is a crucial predictor of firm performance. This has been extensively proven by extant literature in business and operations management. Logistics is also a fundamental determinant of a country's ability to access international markets. Available studies in international and transport economics have shown that limited transport infrastructure and underperforming transport services can severely affect international competitiveness. However, the evidence lacks the overall impact of logistics performance-encompassing all inventory, warehousing, and transport components- on global trade. In order to fill this knowledge gap, the paper uses a gravitational trade model with 155 countries from all geographical regions between 2007 and 2018. Data on logistics performance is obtained from the World Bank's Logistics Performance Index (LPI). First, the relationship between logistics performance and a country’s total trade is estimated, followed by a breakdown by the economic sector. Then, the analysis is disaggregated according to the level of technological intensity of traded goods. Finally, after evaluating the intensive margin of trade, the relevance of logistics infrastructure and services for the extensive trade margin is assessed. Results suggest that: (i) improvements in both logistics infrastructure and services are associated with export growth; (ii) manufactured goods can significantly benefit from these improvements, especially when both exporting and importing countries increase their logistics performance; (iii) the quality of logistics infrastructure and services becomes more important as traded goods are technology-intensive; and (iv) improving the exporting country's logistics performance is essential in the intensive margin of trade while enhancing the importing country's logistics performance is more relevant in the extensive margin.

Keywords: gravity models, infrastructure, international trade, logistics

Procedia PDF Downloads 107
19081 Application of a Compact Wastewater Treatment Unit in a Rural Area

Authors: Mohamed El-Khateeb

Abstract:

Encompassing inventory, warehousing, and transportation management, logistics is a crucial predictor of firm performance. This has been extensively proven by extant literature in business and operations management. Logistics is also a fundamental determinant of a country's ability to access international markets. Available studies in international and transport economics have shown that limited transport infrastructure and underperforming transport services can severely affect international competitiveness. However, the evidence lacks the overall impact of logistics performance-encompassing all inventory, warehousing, and transport components- on global trade. In order to fill this knowledge gap, the paper uses a gravitational trade model with 155 countries from all geographical regions between 2007 and 2018. Data on logistics performance is obtained from the World Bank's Logistics Performance Index (LPI). First, the relationship between logistics performance and a country’s total trade is estimated, followed by a breakdown by the economic sector. Then, the analysis is disaggregated according to the level of technological intensity of traded goods. Finally, after evaluating the intensive margin of trade, the relevance of logistics infrastructure and services for the extensive trade margin is assessed. Results suggest that: (i) improvements in both logistics infrastructure and services are associated with export growth; (ii) manufactured goods can significantly benefit from these improvements, especially when both exporting and importing countries increase their logistics performance; (iii) the quality of logistics infrastructure and services becomes more important as traded goods are technology-intensive; and (iv) improving the exporting country's logistics performance is essential in the intensive margin of trade while enhancing the importing country's logistics performance is more relevant in the extensive margin.

Keywords: low-cost, recycling, reuse, solid waste, wastewater treatment

Procedia PDF Downloads 111
19080 Enhancing Warehousing Operation In Cold Supply Chain Through The Use Of IOT And Lifi Technologies

Authors: Sarah El-Gamal, Passent Hossam, Ahmed Abd El Aziz, Rojina Mahmoud, Ahmed Hassan, Dalia Hilal, Eman Ayman, Hana Haytham, Omar Khamis

Abstract:

Several concerns fall upon the supply chain, especially the cold supply chain. According to the literature, the main challenges in the cold supply chain are the distribution and storage phases. In this research, researchers focused on the storage area, which contains several activities such as the picking activity that faces a lot of obstacles and challenges The implementation of IoT solutions enables businesses to monitor the temperature of food items, which is perhaps the most critical parameter in cold chains. Therefore, researchers proposed a practical solution that would help in eliminating the problems related to ineffective picking for products, especially fish and seafood products, by using IoT technology, most notably LiFi technology. Thus, guaranteeing sufficient picking, reducing waste, and consequently lowering costs. A prototype was specially designed and examined. This research is a single case study research. Two methods of data collection were used; observation and semi-structured interviews. Semi-structured interviews were conducted with managers and decision maker at Carrefour Alexandria to validate the problem and the proposed practical solution using IoTandLiFi technology. A total of three interviews were conducted. As a result, a SWOT analysis was achieved in order to highlight all the strengths and weaknesses of using the recommended Lifi solution in the picking process. According to the investigations, it was found that the use of IoT and LiFi technology is cost effective, efficient, and reduces human errors, minimize the percentage of product waste and thus save money and cost. Thus, increasing customer satisfaction and profits gained.

Keywords: cold supply chain, picking process, temperature control, IOT, warehousing, LIFI

Procedia PDF Downloads 66
19079 Optimization of Lubricant Distribution with Alternative Coordinates and Number of Warehouses Considering Truck Capacity and Time Windows

Authors: Taufik Rizkiandi, Teuku Yuri M. Zagloel, Andri Dwi Setiawan

Abstract:

Distribution and growth in the transportation and warehousing business sector decreased by 15,04%. There was a decrease in Gross Domestic Product (GDP) contribution level from rank 7 of 4,41% in 2019 to 3,81% in rank 8 in 2020. A decline in the transportation and warehousing business sector contributes to GDP, resulting in oil and gas companies implementing an efficient supply chain strategy to ensure the availability of goods, especially lubricants. Fluctuating demand for lubricants and warehouse service time limits are essential things that are taken into account in determining an efficient route. Add depots points as a solution so that demand for lubricants is fulfilled (not stock out). However, adding a depot will increase operating costs and storage costs. Therefore, it is necessary to optimize the addition of depots using the Capacitated Vehicle Routing Problem with Time Windows (CVRPTW). This research case study was conducted at an oil and gas company that produces lubricants from 2019 to 2021. The study results obtained the optimal route and the addition of a depot with a minimum additional cost. The total cost remains efficient with the addition of a depot when compared to one depot from Jakarta.

Keywords: CVRPTW, optimal route, depot, tabu search algorithm

Procedia PDF Downloads 23
19078 Processing Big Data: An Approach Using Feature Selection

Authors: Nikat Parveen, M. Ananthi

Abstract:

Big data is one of the emerging technology, which collects the data from various sensors and those data will be used in many fields. Data retrieval is one of the major issue where there is a need to extract the exact data as per the need. In this paper, large amount of data set is processed by using the feature selection. Feature selection helps to choose the data which are actually needed to process and execute the task. The key value is the one which helps to point out exact data available in the storage space. Here the available data is streamed and R-Center is proposed to achieve this task.

Keywords: big data, key value, feature selection, retrieval, performance

Procedia PDF Downloads 165
19077 Applications of Big Data in Education

Authors: Faisal Kalota

Abstract:

Big Data and analytics have gained a huge momentum in recent years. Big Data feeds into the field of Learning Analytics (LA) that may allow academic institutions to better understand the learners’ needs and proactively address them. Hence, it is important to have an understanding of Big Data and its applications. The purpose of this descriptive paper is to provide an overview of Big Data, the technologies used in Big Data, and some of the applications of Big Data in education. Additionally, it discusses some of the concerns related to Big Data and current research trends. While Big Data can provide big benefits, it is important that institutions understand their own needs, infrastructure, resources, and limitation before jumping on the Big Data bandwagon.

Keywords: big data, learning analytics, analytics, big data in education, Hadoop

Procedia PDF Downloads 274
19076 Analysis of Big Data

Authors: Sandeep Sharma, Sarabjit Singh

Abstract:

As per the user demand and growth trends of large free data the storage solutions are now becoming more challenge-able to protect, store and to retrieve data. The days are not so far when the storage companies and organizations are start saying 'no' to store our valuable data or they will start charging a huge amount for its storage and protection. On the other hand as per the environmental conditions it becomes challenge-able to maintain and establish new data warehouses and data centers to protect global warming threats. A challenge of small data is over now, the challenges are big that how to manage the exponential growth of data. In this paper we have analyzed the growth trend of big data and its future implications. We have also focused on the impact of the unstructured data on various concerns and we have also suggested some possible remedies to streamline big data.

Keywords: big data, unstructured data, volume, variety, velocity

Procedia PDF Downloads 419
19075 Research of Data Cleaning Methods Based on Dependency Rules

Authors: Yang Bao, Shi Wei Deng, WangQun Lin

Abstract:

This paper introduces the concept and principle of data cleaning, analyzes the types and causes of dirty data, and proposes several key steps of typical cleaning process, puts forward a well scalability and versatility data cleaning framework, in view of data with attribute dependency relation, designs several of violation data discovery algorithms by formal formula, which can obtain inconsistent data to all target columns with condition attribute dependent no matter data is structured (SQL) or unstructured (NoSQL), and gives 6 data cleaning methods based on these algorithms.

Keywords: data cleaning, dependency rules, violation data discovery, data repair

Procedia PDF Downloads 338
19074 Mining Big Data in Telecommunications Industry: Challenges, Techniques, and Revenue Opportunity

Authors: Hoda A. Abdel Hafez

Abstract:

Mining big data represents a big challenge nowadays. Many types of research are concerned with mining massive amounts of data and big data streams. Mining big data faces a lot of challenges including scalability, speed, heterogeneity, accuracy, provenance and privacy. In telecommunication industry, mining big data is like a mining for gold; it represents a big opportunity and maximizing the revenue streams in this industry. This paper discusses the characteristics of big data (volume, variety, velocity and veracity), data mining techniques and tools for handling very large data sets, mining big data in telecommunication and the benefits and opportunities gained from them.

Keywords: mining big data, big data, machine learning, telecommunication

Procedia PDF Downloads 289
19073 Multi-Source Data Fusion for Urban Comprehensive Management

Authors: Bolin Hua

Abstract:

In city governance, various data are involved, including city component data, demographic data, housing data and all kinds of business data. These data reflects different aspects of people, events and activities. Data generated from various systems are different in form and data source are different because they may come from different sectors. In order to reflect one or several facets of an event or rule, data from multiple sources need fusion together. Data from different sources using different ways of collection raised several issues which need to be resolved. Problem of data fusion include data update and synchronization, data exchange and sharing, file parsing and entry, duplicate data and its comparison, resource catalogue construction. Governments adopt statistical analysis, time series analysis, extrapolation, monitoring analysis, value mining, scenario prediction in order to achieve pattern discovery, law verification, root cause analysis and public opinion monitoring. The result of Multi-source data fusion is to form a uniform central database, which includes people data, location data, object data, and institution data, business data and space data. We need to use meta data to be referred to and read when application needs to access, manipulate and display the data. A uniform meta data management ensures effectiveness and consistency of data in the process of data exchange, data modeling, data cleansing, data loading, data storing, data analysis, data search and data delivery.

Keywords: multi-source data fusion, urban comprehensive management, information fusion, government data

Procedia PDF Downloads 286
19072 Reviewing Privacy Preserving Distributed Data Mining

Authors: Sajjad Baghernezhad, Saeideh Baghernezhad

Abstract:

Nowadays considering human involved in increasing data development some methods such as data mining to extract science are unavoidable. One of the discussions of data mining is inherent distribution of the data usually the bases creating or receiving such data belong to corporate or non-corporate persons and do not give their information freely to others. Yet there is no guarantee to enable someone to mine special data without entering in the owner’s privacy. Sending data and then gathering them by each vertical or horizontal software depends on the type of their preserving type and also executed to improve data privacy. In this study it was attempted to compare comprehensively preserving data methods; also general methods such as random data, coding and strong and weak points of each one are examined.

Keywords: data mining, distributed data mining, privacy protection, privacy preserving

Procedia PDF Downloads 383
19071 The Right to Data Portability and Its Influence on the Development of Digital Services

Authors: Roman Bieda

Abstract:

The General Data Protection Regulation (GDPR) will come into force on 25 May 2018 which will create a new legal framework for the protection of personal data in the European Union. Article 20 of GDPR introduces a right to data portability. This right allows for data subjects to receive the personal data which they have provided to a data controller, in a structured, commonly used and machine-readable format, and to transmit this data to another data controller. The right to data portability, by facilitating transferring personal data between IT environments (e.g.: applications), will also facilitate changing the provider of services (e.g. changing a bank or a cloud computing service provider). Therefore, it will contribute to the development of competition and the digital market. The aim of this paper is to discuss the right to data portability and its influence on the development of new digital services.

Keywords: data portability, digital market, GDPR, personal data

Procedia PDF Downloads 351
19070 Recent Advances in Data Warehouse

Authors: Fahad Hanash Alzahrani

Abstract:

This paper describes some recent advances in a quickly developing area of data storing and processing based on Data Warehouses and Data Mining techniques, which are associated with software, hardware, data mining algorithms and visualisation techniques having common features for any specific problems and tasks of their implementation.

Keywords: data warehouse, data mining, knowledge discovery in databases, on-line analytical processing

Procedia PDF Downloads 303
19069 How to Use Big Data in Logistics Issues

Authors: Mehmet Akif Aslan, Mehmet Simsek, Eyup Sensoy

Abstract:

Big Data stands for today’s cutting-edge technology. As the technology becomes widespread, so does Data. Utilizing massive data sets enable companies to get competitive advantages over their adversaries. Out of many area of Big Data usage, logistics has significance role in both commercial sector and military. This paper lays out what big data is and how it is used in both military and commercial logistics.

Keywords: big data, logistics, operational efficiency, risk management

Procedia PDF Downloads 420
19068 Implementation of an IoT Sensor Data Collection and Analysis Library

Authors: Jihyun Song, Kyeongjoo Kim, Minsoo Lee

Abstract:

Due to the development of information technology and wireless Internet technology, various data are being generated in various fields. These data are advantageous in that they provide real-time information to the users themselves. However, when the data are accumulated and analyzed, more various information can be extracted. In addition, development and dissemination of boards such as Arduino and Raspberry Pie have made it possible to easily test various sensors, and it is possible to collect sensor data directly by using database application tools such as MySQL. These directly collected data can be used for various research and can be useful as data for data mining. However, there are many difficulties in using the board to collect data, and there are many difficulties in using it when the user is not a computer programmer, or when using it for the first time. Even if data are collected, lack of expert knowledge or experience may cause difficulties in data analysis and visualization. In this paper, we aim to construct a library for sensor data collection and analysis to overcome these problems.

Keywords: clustering, data mining, DBSCAN, k-means, k-medoids, sensor data

Procedia PDF Downloads 285
19067 Government (Big) Data Ecosystem: Definition, Classification of Actors, and Their Roles

Authors: Syed Iftikhar Hussain Shah, Vasilis Peristeras, Ioannis Magnisalis

Abstract:

Organizations, including governments, generate (big) data that are high in volume, velocity, veracity, and come from a variety of sources. Public Administrations are using (big) data, implementing base registries, and enforcing data sharing within the entire government to deliver (big) data related integrated services, provision of insights to users, and for good governance. Government (Big) data ecosystem actors represent distinct entities that provide data, consume data, manipulate data to offer paid services, and extend data services like data storage, hosting services to other actors. In this research work, we perform a systematic literature review. The key objectives of this paper are to propose a robust definition of government (big) data ecosystem and a classification of government (big) data ecosystem actors and their roles. We showcase a graphical view of actors, roles, and their relationship in the government (big) data ecosystem. We also discuss our research findings. We did not find too much published research articles about the government (big) data ecosystem, including its definition and classification of actors and their roles. Therefore, we lent ideas for the government (big) data ecosystem from numerous areas that include scientific research data, humanitarian data, open government data, industry data, in the literature.

Keywords: big data, big data ecosystem, classification of big data actors, big data actors roles, definition of government (big) data ecosystem, data-driven government, eGovernment, gaps in data ecosystems, government (big) data, public administration, systematic literature review

Procedia PDF Downloads 56
19066 Government Big Data Ecosystem: A Systematic Literature Review

Authors: Syed Iftikhar Hussain Shah, Vasilis Peristeras, Ioannis Magnisalis

Abstract:

Data that is high in volume, velocity, veracity and comes from a variety of sources is usually generated in all sectors including the government sector. Globally public administrations are pursuing (big) data as new technology and trying to adopt a data-centric architecture for hosting and sharing data. Properly executed, big data and data analytics in the government (big) data ecosystem can be led to data-driven government and have a direct impact on the way policymakers work and citizens interact with governments. In this research paper, we conduct a systematic literature review. The main aims of this paper are to highlight essential aspects of the government (big) data ecosystem and to explore the most critical socio-technical factors that contribute to the successful implementation of government (big) data ecosystem. The essential aspects of government (big) data ecosystem include definition, data types, data lifecycle models, and actors and their roles. We also discuss the potential impact of (big) data in public administration and gaps in the government data ecosystems literature. As this is a new topic, we did not find specific articles on government (big) data ecosystem and therefore focused our research on various relevant areas like humanitarian data, open government data, scientific research data, industry data, etc.

Keywords: applications of big data, big data, big data types. big data ecosystem, critical success factors, data-driven government, egovernment, gaps in data ecosystems, government (big) data, literature review, public administration, systematic review

Procedia PDF Downloads 51
19065 A Machine Learning Decision Support Framework for Industrial Engineering Purposes

Authors: Anli Du Preez, James Bekker

Abstract:

Data is currently one of the most critical and influential emerging technologies. However, the true potential of data is yet to be exploited since, currently, about 1% of generated data are ever actually analyzed for value creation. There is a data gap where data is not explored due to the lack of data analytics infrastructure and the required data analytics skills. This study developed a decision support framework for data analytics by following Jabareen’s framework development methodology. The study focused on machine learning algorithms, which is a subset of data analytics. The developed framework is designed to assist data analysts with little experience, in choosing the appropriate machine learning algorithm given the purpose of their application.

Keywords: Data analytics, Industrial engineering, Machine learning, Value creation

Procedia PDF Downloads 64
19064 Providing Security to Private Cloud Using Advanced Encryption Standard Algorithm

Authors: Annapureddy Srikant Reddy, Atthanti Mahendra, Samala Chinni Krishna, N. Neelima

Abstract:

In our present world, we are generating a lot of data and we, need a specific device to store all these data. Generally, we store data in pen drives, hard drives, etc. Sometimes we may loss the data due to the corruption of devices. To overcome all these issues, we implemented a cloud space for storing the data, and it provides more security to the data. We can access the data with just using the internet from anywhere in the world. We implemented all these with the java using Net beans IDE. Once user uploads the data, he does not have any rights to change the data. Users uploaded files are stored in the cloud with the file name as system time and the directory will be created with some random words. Cloud accepts the data only if the size of the file is less than 2MB.

Keywords: cloud space, AES, FTP, NetBeans IDE

Procedia PDF Downloads 110
19063 Business Intelligence for Profiling of Telecommunication Customer

Authors: Rokhmatul Insani, Hira Laksmiwati Soemitro

Abstract:

Business Intelligence is a methodology that exploits the data to produce information and knowledge systematically, business intelligence can support the decision-making process. Some methods in business intelligence are data warehouse and data mining. A data warehouse can store historical data from transactional data. For data modelling in data warehouse, we apply dimensional modelling by Kimball. While data mining is used to extracting patterns from the data and get insight from the data. Data mining has many techniques, one of which is segmentation. For profiling of telecommunication customer, we use customer segmentation according to customer’s usage of services, customer invoice and customer payment. Customers can be grouped according to their characteristics and can be identified the profitable customers. We apply K-Means Clustering Algorithm for segmentation. The input variable for that algorithm we use RFM (Recency, Frequency and Monetary) model. All process in data mining, we use tools IBM SPSS modeller.

Keywords: business intelligence, customer segmentation, data warehouse, data mining

Procedia PDF Downloads 327
19062 Imputation Technique for Feature Selection in Microarray Data Set

Authors: Younies Saeed Hassan Mahmoud, Mai Mabrouk, Elsayed Sallam

Abstract:

Analysing DNA microarray data sets is a great challenge, which faces the bioinformaticians due to the complication of using statistical and machine learning techniques. The challenge will be doubled if the microarray data sets contain missing data, which happens regularly because these techniques cannot deal with missing data. One of the most important data analysis process on the microarray data set is feature selection. This process finds the most important genes that affect certain disease. In this paper, we introduce a technique for imputing the missing data in microarray data sets while performing feature selection.

Keywords: DNA microarray, feature selection, missing data, bioinformatics

Procedia PDF Downloads 421
19061 PDDA: Priority-Based, Dynamic Data Aggregation Approach for Sensor-Based Big Data Framework

Authors: Lutful Karim, Mohammed S. Al-kahtani

Abstract:

Sensors are being used in various applications such as agriculture, health monitoring, air and water pollution monitoring, traffic monitoring and control and hence, play the vital role in the growth of big data. However, sensors collect redundant data. Thus, aggregating and filtering sensors data are significantly important to design an efficient big data framework. Current researches do not focus on aggregating and filtering data at multiple layers of sensor-based big data framework. Thus, this paper introduces (i) three layers data aggregation and framework for big data and (ii) a priority-based, dynamic data aggregation scheme (PDDA) for the lowest layer at sensors. Simulation results show that the PDDA outperforms existing tree and cluster-based data aggregation scheme in terms of overall network energy consumptions and end-to-end data transmission delay.

Keywords: big data, clustering, tree topology, data aggregation, sensor networks

Procedia PDF Downloads 224
19060 Control the Flow of Big Data

Authors: Shizra Waris, Saleem Akhtar

Abstract:

Big data is a research area receiving attention from academia and IT communities. In the digital world, the amounts of data produced and stored have within a short period of time. Consequently this fast increasing rate of data has created many challenges. In this paper, we use functionalism and structuralism paradigms to analyze the genesis of big data applications and its current trends. This paper presents a complete discussion on state-of-the-art big data technologies based on group and stream data processing. Moreover, strengths and weaknesses of these technologies are analyzed. This study also covers big data analytics techniques, processing methods, some reported case studies from different vendor, several open research challenges and the chances brought about by big data. The similarities and differences of these techniques and technologies based on important limitations are also investigated. Emerging technologies are suggested as a solution for big data problems.

Keywords: computer, it community, industry, big data

Procedia PDF Downloads 44
19059 High Performance Computing and Big Data Analytics

Authors: Branci Sarra, Branci Saadia

Abstract:

Because of the multiplied data growth, many computer science tools have been developed to process and analyze these Big Data. High-performance computing architectures have been designed to meet the treatment needs of Big Data (view transaction processing standpoint, strategic, and tactical analytics). The purpose of this article is to provide a historical and global perspective on the recent trend of high-performance computing architectures especially what has a relation with Analytics and Data Mining.

Keywords: high performance computing, HPC, big data, data analysis

Procedia PDF Downloads 416