Search results for: geospatial data
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 2749

2749 Faculty Use of Geospatial Tools for Deep Learning in Science and Engineering Courses

Authors: Laura Rodriguez Amaya

Abstract:

Advances in science, technology, engineering, and mathematics (STEM) are viewed as important to countries’ national economies and their capacities to be competitive in the global economy. However, many countries experience low numbers of students entering these disciplines. To strengthen the professional STEM pipelines, it is important that students are retained in these disciplines at universities. Scholars agree that to retain students in universities’ STEM degrees, it is necessary that STEM course content shows the relevance of these academic fields to their daily lives. By increasing students’ understanding of the importance of these degrees and careers, students’ motivation to remain in these academic programs can also increase. An effective way to make STEM content relevant to students’ lives is the use of geospatial technologies and geovisualization in the classroom. The Geospatial Revolution, and the science and technology associated with it, has provided scientists and engineers with an incredible amount of data about Earth and Earth systems. These data can be used in the classroom to support instruction and make content relevant to all students. The purpose of this study was to determine the prevalence of the use of geospatial technologies and geovisualization as teaching practices at a USA university. The Teaching Practices Inventory survey, which is a modified version of the Carl Wieman Science Education Initiative Teaching Practices Inventory, was selected for the study. Faculty in the STEM disciplines who participated in a summer learning institute at a 4-year university in the USA constituted the population selected for the study. One of the summer learning institute’s main purposes was to have an impact on the teaching of STEM courses, particularly the teaching of gateway courses taken by many STEM majors. The sample population for the study is 97.5 percent of the total number of summer learning institute participants.
Basic descriptive statistics were computed with the Statistical Package for the Social Sciences (SPSS) to find out: 1) the percentage of faculty using geospatial technologies and geovisualization; 2) whether a faculty member’s department affected their use of geospatial tools; and 3) whether the number of years in a teaching capacity affected their use of geospatial tools. Findings indicate that only 10 percent of respondents had used geospatial technologies, and 18 percent had used geospatial visualization. In addition, the use of geovisualization among faculty of different disciplines was broader than the use of geospatial technologies. The use of geospatial technologies was concentrated in the engineering departments. The data seem to indicate a lack of incorporation of geospatial tools in STEM education. The use of geospatial tools is an effective way to engage students in deep STEM learning. Future research should look at the effect on student learning and retention in science and engineering programs when geospatial tools are used.

Keywords: engineering education, geospatial technology, geovisualization, STEM

Procedia PDF Downloads 231
2748 Geospatial Network Analysis Using Particle Swarm Optimization

Authors: Varun Singh, Mainak Bandyopadhyay, Maharana Pratap Singh

Abstract:

The shortest path (SP) problem is concerned with finding the shortest path from a specified origin to a specified destination in a given network while minimizing the total cost associated with the path. This problem has widespread applications. Important applications of the SP problem include vehicle routing in transportation systems, particularly in-vehicle Route Guidance Systems (RGS), and the traffic assignment problem in transportation planning. Evolutionary methods such as Genetic Algorithms (GA), Ant Colony Optimization, and Particle Swarm Optimization (PSO) have been applied to complex optimization problems to overcome the shortcomings of existing shortest path analysis methods. Various researchers have reported that PSO performs better than other evolutionary optimization algorithms in terms of success rate and solution quality. Further, Geographic Information Systems (GIS) have emerged as key information systems for geospatial data analysis and visualization. This research paper focuses on the application of PSO to solving the shortest path problem between multiple points of interest (POI), based on spatial data of Allahabad City and traffic speed data collected using GPS. Geovisualization of the analysis results is carried out in GIS.
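
To make the shortest path problem described in this abstract concrete, the sketch below solves it with Dijkstra's algorithm, one of the classical exact methods that heuristic approaches such as PSO are usually compared against. It is not the authors' PSO method. The function name `shortest_path`, the `roads` network, and the travel times are hypothetical and chosen only for illustration.

```python
import heapq


def shortest_path(graph, origin, destination):
    """Return (total_cost, path) for the cheapest route, or (inf, []) if unreachable.

    graph maps each node to a dict of {neighbour: non-negative edge cost}.
    """
    best = {origin: 0.0}      # cheapest known cost to reach each node
    previous = {}             # predecessor of each node on its cheapest route
    queue = [(0.0, origin)]   # min-heap of (cost so far, node)
    while queue:
        cost, node = heapq.heappop(queue)
        if node == destination:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry, a cheaper route was already found
        for neighbour, weight in graph.get(node, {}).items():
            new_cost = cost + weight
            if new_cost < best.get(neighbour, float("inf")):
                best[neighbour] = new_cost
                previous[neighbour] = node
                heapq.heappush(queue, (new_cost, neighbour))
    if destination not in best:
        return float("inf"), []
    # Walk the predecessor links back from the destination to the origin.
    path = [destination]
    while path[-1] != origin:
        path.append(previous[path[-1]])
    path.reverse()
    return best[destination], path


# Hypothetical travel times (minutes) between four points of interest.
roads = {
    "A": {"B": 4.0, "C": 2.0},
    "B": {"D": 5.0},
    "C": {"B": 1.0, "D": 8.0},
    "D": {},
}
cost, path = shortest_path(roads, "A", "D")
print(cost, path)
```

In a GIS setting the edge costs would typically be derived from segment length and observed traffic speed, which is where GPS speed data like that mentioned in the abstract would enter.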

Keywords: particle swarm optimization, GIS, traffic data, outliers

Procedia PDF Downloads 460
2747 Impact of Urbanization on Natural Drainage Pattern in District of Larkana, Sindh Pakistan

Authors: Sumaira Zafar, Arjumand Zaidi

Abstract:

During the past few years, several floods have adversely affected the areas along the lower Indus River. Besides other climate-related anomalies, rapidly increasing urbanization and blockage of natural drains due to siltation or encroachments are two other critical causes that may be responsible for these disasters. Due to the flat topography of the Indus River plains and blockage of natural waterways, drainage of storm water takes time, adversely affecting the crop health and soil properties of the area. The Government of Sindh is taking a keen interest in the revival of the natural drainage network in the province and has initiated this work under the Sindh Irrigation and Drainage Authority. In this paper, geospatial techniques are used to analyze land-use/land-cover changes of Larkana district over the past three decades (1980-present) and their impact on the natural drainage system. A satellite-derived Digital Elevation Model (DEM) and topographic sheets (recent and 1950) are used to delineate the natural drainage pattern of the district. The urban land-use map developed in this study is further overlaid on the drainage line layer to identify the critical areas where the natural floodwater flows are being inhibited by urbanization. Rainfall and flow data are utilized to identify areas of heavy flow, whereas satellite data including Landsat 7 and Google Earth are used to map the extent of previous floods and the land use/cover of the study area. Alternatives to natural drainage systems are also suggested wherever possible. The output maps of the natural drainage pattern can be used to develop a decision support system for urban planners, Sindh development authorities, and flood mitigation and management agencies.

Keywords: geospatial techniques, satellite data, natural drainage, flood, urbanization

Procedia PDF Downloads 488
2746 Examining Litter Distributions in Lethbridge, Alberta, Canada, Using Citizen Science and GIS Methods: OpenLitterMap App and Story Maps

Authors: Tali Neta

Abstract:

Humans’ impact on the environment has been incredibly brutal, with enormous quantities of plastic and other pollutants (e.g., cigarette butts, paper cups, tires) worldwide. On land, litter costs taxpayers a fortune. Most litter pollution comes from the land, yet it is one of the greatest hazards to marine environments. Due to spatial and temporal limitations, previous litter data covered very small areas. Currently, smartphones can be used to obtain information on various pollutants (through citizen science), and they can greatly assist in acknowledging and mitigating the environmental impact of litter. Litter app data, such as those from Litterati, are available for study through a global map only; these data are not available for download, and it is not clear whether irrelevant hashtags have been eliminated. Instagram and Twitter open-source geospatial data are available for download; however, these are considered inaccurate, computationally challenging, and impossible to quantify. Therefore, the resulting data are of poor quality. Other downloadable geospatial data (e.g., Marine Debris Tracker and Clean Swell) are focused on marine rather than terrestrial litter. Therefore, accurate terrestrial geospatial documentation of litter distribution is needed to improve environmental awareness. The current research employed citizen science to examine litter distribution in Lethbridge, Alberta, Canada, using the OpenLitterMap (OLM) app. The OLM app is an application used to track litter worldwide, and it can mark litter locations through photo georeferencing, which can be presented through GIS-designed maps. The OLM app provides open-source data that can be downloaded. It also offers information on various litter types and “hot-spot” areas where litter accumulates. In this study, Lethbridge College students collected litter data with the OLM app.
The students produced GIS Story Maps (interactive web GIS illustrations) and presented these to school children to improve awareness of litter’s impact on environmental health. Preliminary results indicate that towards the east edges of the Lethbridge coulees (valleys), the amount of litter increased significantly due to the presence of shrubs, which acted as litter catches. As wind generally travels from west to east in Lethbridge, litter in West Lethbridge often finds its way down into the east part of the coulees. The students documented various litter types, the majority of which (75%) were plastic and paper food packaging. The students also found metal wires, broken glass, plastic bottles, golf balls, and tires. Presentations of the Story Maps to school children had a significant impact, as the children voluntarily collected litter during school recess and looked into solutions to reduce litter. Further litter distribution documentation through citizen science is needed to improve public awareness. Additionally, future research will focus on drone imagery of areas with high litter concentrations. Finally, a time series analysis of litter distribution will help us determine whether public education through citizen science and Story Maps can assist in reducing litter and reaching a cleaner and healthier environment.

Keywords: citizen science, litter pollution, Open Litter Map, GIS Story Map

Procedia PDF Downloads 56
2745 A Comparative Study on Automatic Feature Classification Methods of Remote Sensing Images

Authors: Lee Jeong Min, Lee Mi Hee, Eo Yang Dam

Abstract:

Geospatial feature extraction is a very important issue in remote sensing research. Image classification has traditionally been based on statistical techniques, but in recent years data mining and machine learning techniques for automated image processing have been applied to remote sensing, with a focus on the possibility of improved results. In this study, artificial neural network and decision tree techniques are applied to classify high-resolution satellite images; the results are compared with those of maximum likelihood classification (MLC), a statistical technique, and the pros and cons of each technique are analyzed.
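
As a rough illustration of the statistical baseline named in this abstract, the sketch below implements a simplified maximum likelihood classifier. It assumes equal class priors and treats each band as an independent Gaussian (a diagonal covariance), whereas MLC in remote sensing software usually uses a full covariance matrix per class. The training pixels, class names, and function names are hypothetical and are not taken from the study.

```python
import math
from statistics import mean, pvariance


def fit(training):
    """Estimate a (mean, variance) pair per band for every class.

    training maps a class name to a list of pixels; each pixel is a tuple of band values.
    """
    model = {}
    for label, pixels in training.items():
        bands = list(zip(*pixels))  # regroup values band by band
        model[label] = [(mean(band), pvariance(band)) for band in bands]
    return model


def log_likelihood(pixel, stats):
    """Log of the product of per-band Gaussian densities for one class."""
    total = 0.0
    for value, (mu, var) in zip(pixel, stats):
        total += -0.5 * math.log(2 * math.pi * var) - (value - mu) ** 2 / (2 * var)
    return total


def classify(pixel, model):
    """Assign the class whose Gaussian model gives the pixel the highest likelihood."""
    return max(model, key=lambda label: log_likelihood(pixel, model[label]))


# Hypothetical two-band training samples (e.g. red and near-infrared values).
training = {
    "water": [(20, 10), (22, 12), (18, 11), (21, 9)],
    "vegetation": [(40, 90), (42, 95), (38, 88), (41, 92)],
}
model = fit(training)
print(classify((39, 91), model))
```

A decision tree or neural network would replace `fit` and `classify` with a learned model, while the training samples and the per-pixel labelling step would stay the same, which is what makes the comparison in the abstract possible.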

Keywords: remote sensing, artificial neural network, decision tree, maximum likelihood classification

Procedia PDF Downloads 334
2744 Applications Using Geographic Information System for Planning and Development of Energy Efficient and Sustainable Living for Smart-Cities

Authors: Javed Mohammed

Abstract:

As urbanization has been and will continue happening on an unprecedented scale worldwide, both academic research and practice are pressing for smart management and intelligent planning of cities to handle the increasing demands on infrastructure and the potential risks of inhabitant agglomeration in disaster management. Geo-spatial data and Geographic Information Systems (GIS) are essential components for building smart cities at a basic level, mapping the physical world into a virtual environment as a referencing framework. At a higher level, GIS has become very important in different sectors of smart cities. In the digital city era, digital maps and geospatial databases have long been integrated into government workflows in land management, urban planning and transportation. People have anticipated that GIS will be more powerful not only as an archival and data management tool but also as a source of spatial models for supporting decision-making in intelligent cities. The purpose of this project is to offer observations and analysis based on a detailed discussion of a Geographic Information Systems (GIS) driven framework for the development of smart and sustainable cities through high penetration of renewable energy technologies.

Keywords: digital maps, geo-spatial, geographic information system, smart cities, renewable energy, urban planning

Procedia PDF Downloads 512
2743 PDDA: Priority-Based, Dynamic Data Aggregation Approach for Sensor-Based Big Data Framework

Authors: Lutful Karim, Mohammed S. Al-kahtani

Abstract:

Sensors are being used in various applications such as agriculture, health monitoring, air and water pollution monitoring, and traffic monitoring and control, and hence play a vital role in the growth of big data. However, sensors collect redundant data. Thus, aggregating and filtering sensor data are important for designing an efficient big data framework. Current research does not focus on aggregating and filtering data at multiple layers of a sensor-based big data framework. Thus, this paper introduces (i) a three-layer data aggregation framework for big data and (ii) a priority-based, dynamic data aggregation scheme (PDDA) for the lowest layer, at the sensors. Simulation results show that PDDA outperforms existing tree-based and cluster-based data aggregation schemes in terms of overall network energy consumption and end-to-end data transmission delay.

Keywords: big data, clustering, tree topology, data aggregation, sensor networks

Procedia PDF Downloads 321
2742 Proposal of Data Collection from Probes

Authors: M. Kebisek, L. Spendla, M. Kopcek, T. Skulavik

Abstract:

In our paper we describe the security capabilities of data collection. Data are collected with probes located in the near and distant surroundings of the company. Considering the numerous obstacles e.g. forests, hills, urban areas, the data collection is realized in several ways. The collection of data uses connection via wireless communication, LAN network, GSM network and in certain areas data are collected by using vehicles. In order to ensure the connection to the server most of the probes have ability to communicate in several ways. Collected data are archived and subsequently used in supervisory applications. To ensure the collection of the required data, it is necessary to propose algorithms that will allow the probes to select suitable communication channel.

Keywords: communication, computer network, data collection, probe

Procedia PDF Downloads 341
2741 Algorithms used in Spatial Data Mining GIS

Authors: Vahid Bairami Rad

Abstract:

Extracting knowledge from spatial data like GIS data is important to reduce the data and extract information. Therefore, the development of new techniques and tools that support the human in transforming data into useful knowledge has been the focus of the relatively new and interdisciplinary research area ‘knowledge discovery in databases’. Thus, we introduce a set of database primitives or basic operations for spatial data mining which are sufficient to express most of the spatial data mining algorithms from the literature. This approach has several advantages. Similar to the relational standard language SQL, the use of standard primitives will speed-up the development of new data mining algorithms and will also make them more portable. We introduced a database-oriented framework for spatial data mining which is based on the concepts of neighborhood graphs and paths. A small set of basic operations on these graphs and paths were defined as database primitives for spatial data mining. Furthermore, techniques to efficiently support the database primitives by a commercial DBMS were presented.
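
The idea of neighborhood graphs and paths as basic operations can be sketched as follows. This is only an illustration under stated assumptions: the neighborhood relation is taken to be "within a given Euclidean distance", and the function names `neighbourhood_graph` and `neighbourhood_paths`, as well as the sample sites, are hypothetical. The framework in the abstract runs such primitives inside a DBMS, which this in-memory sketch does not attempt.

```python
import math


def neighbourhood_graph(objects, max_distance):
    """Connect every pair of objects that lie within max_distance of each other.

    objects maps a name to an (x, y) coordinate pair.
    """
    graph = {name: set() for name in objects}
    names = sorted(objects)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            ax, ay = objects[a]
            bx, by = objects[b]
            if math.hypot(ax - bx, ay - by) <= max_distance:
                graph[a].add(b)
                graph[b].add(a)
    return graph


def neighbourhood_paths(graph, start, length):
    """Return all simple paths with exactly `length` edges that begin at start."""
    paths = [[start]]
    for _ in range(length):
        paths = [
            path + [nxt]
            for path in paths
            for nxt in sorted(graph[path[-1]])
            if nxt not in path  # keep paths simple (no revisits)
        ]
    return paths


# Hypothetical spatial objects with planar coordinates.
sites = {"school": (0, 0), "park": (1, 0), "river": (2, 0), "mill": (5, 5)}
graph = neighbourhood_graph(sites, 1.5)
paths = neighbourhood_paths(graph, "school", 2)
print(graph["park"], paths)
```

Spatial data mining algorithms such as clustering or trend detection can then be written in terms of these two operations instead of raw geometry, which is the portability argument the abstract makes by analogy with SQL.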

Keywords: spatial data base, knowledge discovery database, data mining, spatial relationship, predictive data mining

Procedia PDF Downloads 439
2740 A Data Envelopment Analysis Model in a Multi-Objective Optimization with Fuzzy Environment

Authors: Michael Gidey Gebru

Abstract:

Most Data Envelopment Analysis models operate in a static environment with input and output parameters that are given by deterministic data. However, due to ambiguity brought on by shifting market conditions, input and output data are not always precisely gathered in real-world scenarios. Fuzzy numbers can be used to address this kind of ambiguity in input and output data. Therefore, this work aims to extend crisp Data Envelopment Analysis into Data Envelopment Analysis with a fuzzy environment. In this study, the input and output data are regarded as triangular fuzzy numbers. Then, the Data Envelopment Analysis model with fuzzy environment is solved using a multi-objective method to gauge the efficiency of the Decision Making Units. Finally, the developed Data Envelopment Analysis model is illustrated with an application to real data from 50 educational institutions.
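
To show what treating inputs and outputs as triangular fuzzy numbers means in practice, the sketch below computes alpha-cuts and the resulting interval for a single output/input ratio. It is not the multi-objective DEA model from the abstract; the class `TriangularFuzzy`, the helper `ratio_interval`, and the staff and graduate figures are hypothetical, and all values are assumed to be positive.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TriangularFuzzy:
    """Triangular fuzzy number with support [low, high] and peak at mode."""
    low: float
    mode: float
    high: float

    def alpha_cut(self, alpha):
        """Interval of values whose membership degree is at least alpha."""
        if not 0.0 <= alpha <= 1.0:
            raise ValueError("alpha must lie in [0, 1]")
        lower = self.low + alpha * (self.mode - self.low)
        upper = self.high - alpha * (self.high - self.mode)
        return lower, upper


def ratio_interval(output, input_, alpha):
    """Bounds on output/input at a given alpha level (positive values assumed)."""
    out_lo, out_hi = output.alpha_cut(alpha)
    in_lo, in_hi = input_.alpha_cut(alpha)
    # Worst case: least output with most input; best case: the reverse.
    return out_lo / in_hi, out_hi / in_lo


# Hypothetical data for one institution: teaching staff (input), graduates (output).
staff = TriangularFuzzy(8.0, 10.0, 12.0)
graduates = TriangularFuzzy(90.0, 100.0, 110.0)

crisp = ratio_interval(graduates, staff, 1.0)  # alpha = 1 collapses to the modes
wide = ratio_interval(graduates, staff, 0.0)   # alpha = 0 uses the full supports
print(crisp, wide)
```

At alpha = 1 the interval collapses to the crisp ratio of the two modes, so a crisp DEA model is recovered as a special case; lower alpha levels widen the interval and express how much the efficiency score could move under the stated uncertainty.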

Keywords: efficiency, Data Envelopment Analysis, fuzzy, higher education, input, output

Procedia PDF Downloads 31
2739 Protecting the Cloud Computing Data Through the Data Backups

Authors: Abdullah Alsaeed

Abstract:

Virtualized computing and cloud computing infrastructures are no longer just buzz or a marketing term. They are a core reality in today’s corporate Information Technology (IT) organizations. Hence, developing effective and efficient methodologies for data backup and data recovery is required more than ever. The purpose of data backup and recovery techniques is to assist organizations in planning their business continuity and disaster recovery approaches. In order to accomplish this strategic objective, a variety of mechanisms have been proposed in recent years. This research paper will explore and examine the latest techniques and solutions for providing data backup and restoration for cloud computing platforms.

Keywords: data backup, data recovery, cloud computing, business continuity, disaster recovery, cost-effective, data encryption.

Procedia PDF Downloads 66
2738 Data-Mining Approach to Analyzing Industrial Process Information for Real-Time Monitoring

Authors: Seung-Lock Seo

Abstract:

This work presents a data-mining empirical monitoring scheme for industrial processes with partially unbalanced data. Measurement data from good operations are relatively easy to gather, but during unusual special events or faults it is generally difficult to collect process information, and it can be almost impossible to analyze some noisy data from industrial processes. In such cases, noise filtering techniques can be used to enhance process monitoring performance on a real-time basis. In addition, pre-processing of raw process data helps to eliminate unwanted variation in industrial process data. In this work, the performance of various monitoring schemes was tested and demonstrated on discrete batch process data. The results showed that monitoring performance improved significantly in terms of the monitoring success rate for given process faults.

Keywords: data mining, process data, monitoring, safety, industrial processes

Procedia PDF Downloads 379
2737 A Method of Detecting the Difference in Two States of Brain Using Statistical Analysis of EEG Raw Data

Authors: Digvijaysingh S. Bana, Kiran R. Trivedi

Abstract:

This paper introduces various methods that use the alpha wave to detect the difference between two states of the brain. One healthy subject participated in the experiment. EEG was measured on the forehead above the eye (FP1 position), with the reference and ground electrodes on an ear clip. The data samples are obtained in the form of EEG raw data. The duration of each reading is one minute. Various tests are performed on the alpha band EEG raw data. The readings are taken at different times throughout the day. The statistical analysis is carried out on the EEG sample data in the form of various tests.

Keywords: electroencephalogram(EEG), biometrics, authentication, EEG raw data

Procedia PDF Downloads 448
2736 Framework for Integrating Big Data and Thick Data: Understanding Customers Better

Authors: Nikita Valluri, Vatcharaporn Esichaikul

Abstract:

With the popularity of data-driven decision making on the rise, this study focuses on providing an alternative outlook towards the process of decision-making. Combining quantitative and qualitative methods rooted in the social sciences, an integrated framework is presented with a focus on delivering a much more robust and efficient approach towards the concept of data-driven decision-making with respect to not only Big data but also 'Thick data', a new form of qualitative data. In support of this, an example from the retail sector has been illustrated where the framework is put into action to yield insights and leverage business intelligence. An interpretive approach to analyze findings from both kinds of quantitative and qualitative data has been used to glean insights. Using traditional Point-of-sale data as well as an understanding of customer psychographics and preferences, techniques of data mining along with qualitative methods (such as grounded theory, ethnomethodology, etc.) are applied. This study’s final goal is to establish the framework as a basis for providing a holistic solution encompassing both the Big and Thick aspects of any business need. The proposed framework is a modified enhancement in lieu of traditional data-driven decision-making approach, which is mainly dependent on quantitative data for decision-making.

Keywords: big data, customer behavior, customer experience, data mining, qualitative methods, quantitative methods, thick data

Procedia PDF Downloads 138
2735 Geospatial Techniques for Impact Assessment of Canal Rehabilitation Program in Sindh, Pakistan

Authors: Sumaira Zafar, Arjumand Zaidi, Muhammad Arslan Hafeez

Abstract:

The Indus Basin Irrigation System (IBIS) is the largest contiguous irrigation system in the world, comprising the Indus River and its tributaries, canals, distributaries, and watercourses. A big challenge faced by IBIS is transmission losses through seepage and leaks, which account for 41 percent of the total water derived from the river; about 40 percent of that is lost through watercourses. Irrigation system rehabilitation programs in Pakistan are focused on improvement of the canal system at the watercourse level (tertiary channels). Under these irrigation system management programs, more than 22,800 watercourses have been improved or lined out of 43,000 (12,900 kilometers) watercourses. An evaluation of the improvement work is required at this stage to verify the success of the programs. In this paper, emerging technologies of GIS and satellite remote sensing are used for impact assessment of watercourse rehabilitation work in Sindh. To evaluate the efficiency of the improved watercourses, a few parameters are selected, such as soil moisture along watercourses, availability of water at the tail end, and changes in cultivable command areas. Details and maps of improved watercourses are acquired from the National Program for Improvement of Watercourses (NPIW) and the Space and Upper Atmospheric Research Commission (SUPARCO). High-resolution satellite images from Google Earth for the years 2004 to 2013 are used for digitizing command areas. Temporal maps of cultivable command areas show a noticeable increase in the cultivable land served by improved watercourses. Field visits are conducted to validate the results. Interviews with farmers and landowners also reveal their overall satisfaction in terms of availability of water at the tail end and increased crop production.

Keywords: geospatial, impact assessment, watercourses, GIS, remote sensing, seepage, canal lining

Procedia PDF Downloads 329
2734 Cloud Computing in Data Mining: A Technical Survey

Authors: Ghaemi Reza, Abdollahi Hamid, Dashti Elham

Abstract:

Cloud computing poses a diversity of challenges for data mining operations, arising from the dynamic structure of data distribution as opposed to the typical database scenarios of conventional architectures. Because an immense number of users seek data on a daily basis, there are serious security concerns for cloud providers as well as for data providers who put their data in the cloud computing environment. Big data analytics use compute-intensive data mining algorithms and tools (hidden Markov models, MapReduce parallel programming, the Mahout project, the Hadoop distributed file system, K-Means and K-Medoids, Apriori) that require efficient high-performance processors to produce timely results. Data mining algorithms are used to solve for or optimize the model parameters. The challenges that the operation has to meet are establishing successful transactions with the existing virtual machine environment and keeping the databases under control. Several factors have led from normal or centralized mining to distributed data mining. The approach is delivered as SaaS, which uses multi-agent systems to implement the different tasks of the system. There are still some problems with data mining based on cloud computing, including the design and selection of data mining algorithms.

Keywords: cloud computing, data mining, computing models, cloud services

Procedia PDF Downloads 456
2733 Efficient Positioning of Data Aggregation Point for Wireless Sensor Network

Authors: Sifat Rahman Ahona, Rifat Tasnim, Naima Hassan

Abstract:

Data aggregation is a helpful technique for reducing data communication overhead in wireless sensor networks. One of the important tasks of data aggregation is the positioning of the aggregator points. A lot of work has been done on data aggregation, but efficient positioning of the aggregator points has not received much attention. In this paper, the authors focus on the positioning or placement of the aggregation points in a wireless sensor network. The authors propose an algorithm to select the aggregator positions for a scenario where aggregator nodes are more powerful than sensor nodes.

Keywords: aggregation point, data communication, data aggregation, wireless sensor network

Procedia PDF Downloads 142
2732 Spatial Econometric Approaches for Count Data: An Overview and New Directions

Authors: Paula Simões, Isabel Natário

Abstract:

This paper reviews a number of theoretical aspects of implementing an explicit spatial perspective in econometrics for modelling non-continuous data in general, and count data in particular. It provides an overview of the spatial econometric approaches that are available to model data collected with reference to location in space, from the classical spatial econometrics approaches to recent developments in spatial econometrics for modelling count data in a Bayesian hierarchical setting. Considerable attention is paid to the inferential framework necessary for structurally consistent spatial econometric count models incorporating spatial lag autocorrelation, to the corresponding estimation and testing procedures for different assumptions, and to the constraints and implications embedded in the various specifications in the literature. This review combines insights from the classical spatial econometrics literature as well as from hierarchical modeling and analysis of spatial data, in order to look for new possible directions in the processing of count data in a spatial hierarchical Bayesian econometric context.

Keywords: spatial data analysis, spatial econometrics, Bayesian hierarchical models, count data

Procedia PDF Downloads 572
2731 Automated Test Data Generation For some types of Algorithm

Authors: Hitesh Tahbildar

Abstract:

The computational cost of test data generation for a program is very high. In the general case, no algorithm that generates test data for all types of algorithms has been found. The cost of generating test data differs for different types of algorithms. To date, researchers have emphasized the need to generate test data for different types of programming constructs rather than for different types of algorithms. Test data generation methods have been implemented to find heuristics for different types of algorithms. Several types of algorithms, including divide and conquer, backtracking, the greedy approach, and dynamic programming, have been tested to find the minimum cost of test data generation. Our experimental results indicate that some of these algorithm types can be used as a necessary condition for selecting heuristics, and that programming constructs are a sufficient condition for selecting our heuristics. Finally, we recommend different heuristics for test data generation for different types of algorithms.

Keywords: longest path, saturation point, lmax, kL, kS

Procedia PDF Downloads 387
2730 PEINS: A Generic Compression Scheme Using Probabilistic Encoding and Irrational Number Storage

Authors: P. Jayashree, S. Rajkumar

Abstract:

With social networks and smart devices generating a multitude of data, effective data management is the need of the hour for networks and cloud applications. Some applications need effective storage, while others need effective communication over networks, and data reduction is a handy solution for meeting both requirements. Most data compression techniques are based on data statistics and may result in either lossy or lossless data reduction. Though lossy reductions produce better compression ratios than lossless methods, many applications require data accuracy and fine details to be preserved. A variety of data compression algorithms exist in the literature for different forms of data such as text, image, and multimedia data. In the proposed work, a generic progressive compression algorithm based on probabilistic encoding, called PEINS, is presented as an enhancement of the irrational number storage coding technique to address the storage issues of increasing data volumes as a cost-effective solution, which also offers data security to some extent as a secondary outcome. The proposed work shows cost effectiveness in terms of a better compression ratio with no deterioration in compression time.

Keywords: compression ratio, generic compression, irrational number storage, probabilistic encoding

Procedia PDF Downloads 271
2729 Adaptive Data Approximations Codec (ADAC) for AI/ML-based Cyber-Physical Systems

Authors: Yong-Kyu Jung

Abstract:

The fast growth in information technology has led to demands to access/process data. CPSs heavily depend on the time of hardware/software operations and communication over the network (i.e., real-time/parallel operations in CPSs, e.g., autonomous vehicles). Data processing is an important means to overcome the issues confronting data management, reducing the gap between technological growth and data complexity and channel bandwidth. An adaptive perpetual data approximation method is introduced to manage the actual entropy of the digital spectrum. An ADAC implemented as an accelerator and/or apps for servers/smart-connected devices adaptively rescales digital contents (avg. 62.8%), data processing/access time/energy, and encryption/decryption overheads in AI/ML applications (facial ID/recognition).

Keywords: adaptive codec, AI, ML, HPC, cyber-physical, cybersecurity

Procedia PDF Downloads 65
2728 Fuzzy Expert Systems Applied to Intelligent Design of Data Centers

Authors: Mario M. Figueroa de la Cruz, Claudia I. Solorzano, Raul Acosta, Ignacio Funes

Abstract:

This technological development project seeks to create a tool that allows companies that need to implement a data center to intelligently determine the factors for allocating cooling and power supply (UPS) support resources at the design stage. The results should clearly show the speed, robustness and reliability of a system designed for deployment in environments that must manage and protect large volumes of data.

Keywords: telecommunications, data center, fuzzy logic, expert systems

Procedia PDF Downloads 331
2727 Cloud Data Security Using Map/Reduce Implementation of Secret Sharing Schemes

Authors: Sara Ibn El Ahrache, Tajje-eddine Rachidi, Hassan Badir, Abderrahmane Sbihi

Abstract:

Recently, there has been increasing confidence in the favorable use of big data drawn from the huge amount of information deposited in cloud computing systems. Data kept on such systems can be retrieved through the network at the user’s convenience. However, the data that users send include private information, and therefore information leakage from these data is now a major social problem. The use of secret sharing schemes for cloud computing, in which users distribute their data to several servers, has lately been shown to be relevant. Notably, in a (k,n) threshold scheme, data security is assured if and only if, throughout the whole life of the secret, the opponent cannot compromise k or more of the n servers. In fact, a number of secret sharing algorithms have been suggested to deal with these security issues. In this paper, we present a Mapreduce implementation of Shamir’s secret sharing scheme to increase its performance and to achieve optimal security for cloud data. Different tests were run, and through them the contributions of the proposed approach have been demonstrated. These contributions are quite considerable in terms of both security and performance.
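The threshold property can be illustrated with a minimal, single-machine sketch of Shamir's scheme, in which any k shares reconstruct the secret and fewer than k do not. This is not the authors' Mapreduce implementation: the prime modulus `PRIME` and the function names `split_secret` and `recover_secret` are assumptions chosen for illustration only.

```python
import secrets

# Assumption: a Mersenne prime (2^127 - 1) larger than any secret shared here.
PRIME = 2**127 - 1


def split_secret(secret, k, n, prime=PRIME):
    """Split `secret` into n shares such that any k of them reconstruct it."""
    if not (1 <= k <= n < prime):
        raise ValueError("need 1 <= k <= n < prime")
    if not (0 <= secret < prime):
        raise ValueError("secret must lie in [0, prime)")
    # Random polynomial of degree k-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(prime) for _ in range(k - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation modulo the prime
            y = (y * x + c) % prime
        shares.append((x, y))
    return shares


def recover_secret(shares, prime=PRIME):
    """Recover the secret by Lagrange interpolation at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % prime
                den = (den * (xi - xj)) % prime
        # pow(den, -1, prime) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, prime)) % prime
    return secret
```

For example, `split_secret(123456789, k=3, n=5)` yields five shares, one per server, and passing any three of them to `recover_secret` returns the original value. Because each share is evaluated independently of the others, share generation is the kind of step that could plausibly be distributed across map tasks.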

Keywords: cloud computing, data security, Mapreduce, Shamir's secret sharing

Procedia PDF Downloads 283
2726 Agile Methodology for Modeling and Design of Data Warehouses -AM4DW-

Authors: Nieto Bernal Wilson, Carmona Suarez Edgar

Abstract:

Organizations hold structured and unstructured information in different formats, sources, and systems. Part of it comes from ERP systems under OLTP processing that support the information system. At the OLAP processing level, however, these organizations show some deficiencies. Part of the problem lies in a lack of interest in extracting knowledge from their data sources, as well as in the absence of the operational capabilities needed to tackle this kind of project. The data warehouse and its applications are considered non-proprietary tools that are of great interest to business intelligence, since they are the base repositories for creating models or patterns (behavior of customers, suppliers, products, social networks, and genomics) and facilitate corporate decision making and research. This paper presents a simple, structured methodology inspired by agile development models such as Scrum, XP, and AUP, as well as by object-relational models, spatial data models, and the baseline of data modeling under UML and big data. In this way, it seeks to deliver an agile methodology for developing data warehouses that is simple and easy to apply. The methodology naturally takes into account the application of processes for the respective information analysis, visualization, and data mining, particularly for generating patterns and derived models from the structured fact objects.

Keywords: data warehouse, model data, big data, object fact, object relational fact, process developed data warehouse

Procedia PDF Downloads 390
2725 Identifying Model to Predict Deterioration of Water Mains Using Robust Analysis

Authors: Go Bong Choi, Shin Je Lee, Sung Jin Yoo, Gibaek Lee, Jong Min Lee

Abstract:

In South Korea, it is difficult to obtain data for statistical pipe assessment. To address this issue, this paper examines how previously presented statistical models behave when the data are mixed with noise and whether they can be applied in South Korea. Three major types of model are studied. Where a paper provides data, we add noise to the data and observe how the model response changes. Moreover, we generate data from the model in each paper and analyse the effect of noise. From this, we can determine the robustness of each model and its applicability in Korea.

Keywords: proportional hazard model, survival model, water main deterioration, ecological sciences

Procedia PDF Downloads 725
2724 Impact of Stack Caches: Locality Awareness and Cost Effectiveness

Authors: Abdulrahman K. Alshegaifi, Chun-Hsi Huang

Abstract:

Treating data based on its location in memory has received much attention in recent years because data in different regions have different properties, which are important for cache utilization. Stack data and non-stack data may interfere with each other’s locality in the data cache. One of the important aspects of stack data is that it has high spatial and temporal locality. In this work, we simulate a non-unified cache design that splits the data cache into stack and non-stack caches in order to keep stack data and non-stack data separate in different caches. We observe that the overall hit rate of the non-unified cache design is sensitive to the size of the non-stack cache. We then investigate the appropriate size and associativity for the stack cache to achieve a high hit ratio, especially when over 99% of accesses are directed to the stack cache. The results show that, on average, more than 99% stack cache accuracy is achieved by using 2KB of capacity and 1-way associativity. Further, we analyze the improvement in hit rate when adding a small, fixed-size stack cache at level 1 to a unified cache architecture. The results show that the overall hit rate of the unified cache design with an added 1KB stack cache improves by approximately 3.9% on average for the Rijndael benchmark. The stack cache is simulated using the SimpleScalar toolset.
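The split-cache idea can be sketched with a toy tag-only simulator. This is not the SimpleScalar setup used in the study: the class names, the 32-byte line size, the direct-mapped non-stack cache, and the `stack_base` address boundary are all assumptions made for illustration.

```python
class DirectMappedCache:
    """1-way (direct-mapped) cache model that tracks tags only, not data."""

    def __init__(self, size_bytes, line_bytes=32):
        self.line_bytes = line_bytes
        self.num_lines = size_bytes // line_bytes
        self.tags = [None] * self.num_lines
        self.hits = 0
        self.accesses = 0

    def access(self, address):
        """Record one access and return True on a hit, False on a miss."""
        block = address // self.line_bytes
        index = block % self.num_lines
        tag = block // self.num_lines
        self.accesses += 1
        if self.tags[index] == tag:
            self.hits += 1
            return True
        self.tags[index] = tag  # miss: fill the line, evicting the old tag
        return False

    def hit_rate(self):
        return self.hits / self.accesses if self.accesses else 0.0


class SplitDataCache:
    """Routes each access to a stack cache or a non-stack cache."""

    def __init__(self, stack_bytes, non_stack_bytes, stack_base):
        self.stack = DirectMappedCache(stack_bytes)
        self.non_stack = DirectMappedCache(non_stack_bytes)
        # Hypothetical rule: addresses at or above stack_base are stack data.
        self.stack_base = stack_base

    def access(self, address):
        cache = self.stack if address >= self.stack_base else self.non_stack
        return cache.access(address)
```

Feeding an address trace through `SplitDataCache(2048, 8192, stack_base=0x70000000)` and then reading `stack.hit_rate()` and `non_stack.hit_rate()` gives per-cache hit rates, so the effect of each cache's size can be examined separately.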

Keywords: hit rate, locality of program, stack cache, stack data

Procedia PDF Downloads 286
2723 Finding Bicluster on Gene Expression Data of Lymphoma Based on Singular Value Decomposition and Hierarchical Clustering

Authors: Alhadi Bustaman, Soeganda Formalidin, Titin Siswantining

Abstract:

DNA microarray technology is used to analyze thousands of gene expression data simultaneously, which is a very important task for drug development and testing, function annotation, and cancer diagnosis. Various clustering methods have been used for analyzing gene expression data. However, when analyzing very large and heterogeneous collections of gene expression data, conventional clustering methods often cannot produce a satisfactory solution. Biclustering has been used as an alternative approach to identifying structures in gene expression data. In this paper, we introduce a transform technique based on singular value decomposition to identify the normalized matrix of gene expression data, followed by the Mixed-Clustering algorithm and the Lift algorithm, which are inspired by the node-deletion and node-addition phases proposed by Cheng and Church and based on Agglomerative Hierarchical Clustering (AHC). An experimental study on standard datasets demonstrated the effectiveness of the algorithm on gene expression data.
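In Cheng and Church's formulation, node deletion and node addition are driven by the mean squared residue of a candidate submatrix, which is zero for a perfectly additive bicluster. The abstract does not state which score the authors use, so the following is only a sketch of that standard score; the function name and the list-of-lists matrix layout are assumptions.

```python
def mean_squared_residue(matrix, rows, cols):
    """Mean squared residue of the submatrix selected by `rows` and `cols`.

    Each residue is a_ij - (row mean) - (column mean) + (submatrix mean).
    """
    n_rows, n_cols = len(rows), len(cols)
    row_mean = {i: sum(matrix[i][j] for j in cols) / n_cols for i in rows}
    col_mean = {j: sum(matrix[i][j] for i in rows) / n_rows for j in cols}
    all_mean = sum(row_mean.values()) / n_rows
    total = 0.0
    for i in rows:
        for j in cols:
            residue = matrix[i][j] - row_mean[i] - col_mean[j] + all_mean
            total += residue * residue
    return total / (n_rows * n_cols)
```

A node-deletion step would repeatedly drop the row or column that contributes most to this score until it falls below a chosen threshold, and a node-addition step would add back rows or columns that do not raise it.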

Keywords: agglomerative hierarchical clustering (AHC), biclustering, gene expression data, lymphoma, singular value decomposition (SVD)

Procedia PDF Downloads 262
2722 An Efficient Traceability Mechanism in the Audited Cloud Data Storage

Authors: Ramya P, Lino Abraham Varghese, S. Bose

Abstract:

With cloud storage services, data can be stored in the cloud and shared across multiple users. Unexpected hardware/software failures and human errors can easily cause data stored in the cloud to be lost or corrupted, which affects the integrity of cloud data. Some mechanisms have been designed to allow both data owners and public verifiers to efficiently audit cloud data integrity without retrieving the entire data from the cloud server. However, public auditing of the integrity of shared data with the existing mechanisms will unavoidably reveal confidential information, such as the identity of the person, to public verifiers. Here, a privacy-preserving mechanism is proposed to support public auditing of shared data stored in the cloud. It uses group signatures to compute the verification metadata needed to audit the correctness of shared data. The identity of the signer of each block in the shared data is kept confidential from public verifiers, who can easily verify shared data integrity without retrieving the entire file. On demand, however, the signer of each block is revealed to the owner alone. The group private key is generated once by the owner in a static group, whereas in a dynamic group, the group private key is changed when users are revoked from the group. When users leave the group, the blocks they have already signed are re-signed by the cloud service provider instead of the owner, which is handled efficiently by a proxy re-signature scheme.

Keywords: data integrity, dynamic group, group signature, public auditing

Procedia PDF Downloads 374
2721 Customer Churn Analysis in Telecommunication Industry Using Data Mining Approach

Authors: Burcu Oralhan, Zeki Oralhan, Nilsun Sariyer, Kumru Uyar

Abstract:

Data mining has become more and more important and has found a wide range of applications in recent years. Data mining is the process of finding hidden and unknown patterns in big data. One of the applied fields of data mining is Customer Relationship Management. Understanding the relationships between products and customers is crucial for every business. Customer Relationship Management is an approach that focuses on developing customer relationships, retaining customers, and increasing customer satisfaction. In this study, we applied data mining methods to customer relationship management in telecommunications. The study aims to determine the profile of customers who are likely to leave the system and to develop marketing strategies and customized campaigns for customers. The data are grouped by applying classification techniques in order to identify the churners. As a result of this study, we will obtain knowledge from the international telecommunication industry. We will contribute to the understanding and development of this subject in Customer Relationship Management.

Keywords: customer churn analysis, customer relationship management, data mining, telecommunication industry

Procedia PDF Downloads 295
2720 Sequential Data Assimilation with High-Frequency (HF) Radar Surface Current

Authors: Lei Ren, Michael Hartnett, Stephen Nash

Abstract:

Abundant surface current measurements from an HF radar system in a coastal area are assimilated into a model to improve its forecasting ability. A simple sequential data assimilation scheme, Direct Insertion (DI), is applied to update the model forecast states. The influence of Direct Insertion data assimilation over time is analyzed at one reference point. Vector maps of surface current from the models are compared with HF radar measurements. The Root-Mean-Squared-Error (RMSE) between the modeling results and the HF radar measurements is calculated over the last four days, with no data assimilation.
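Direct Insertion and the RMSE comparison can be sketched in a few lines. The flat list used for the model state, the dictionary of observations keyed by grid index, and the function names are assumptions for illustration and do not reflect the authors' model.

```python
import math


def direct_insertion(forecast, observations):
    """Return an updated state in which observed points take the measured value.

    `forecast` is a list of model values (one per grid point) and
    `observations` maps a grid index to a measured value; points without
    an observation keep the forecast value.
    """
    analysis = list(forecast)
    for index, value in observations.items():
        analysis[index] = value
    return analysis


def rmse(model, measured):
    """Root-mean-squared error between two equal-length sequences."""
    if len(model) != len(measured) or not model:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = sum((m - o) ** 2 for m, o in zip(model, measured))
    return math.sqrt(squared / len(model))
```

In a sequential scheme, `direct_insertion` would be called each time new radar measurements arrive and the model would then be stepped forward from the updated state, while `rmse` would be evaluated against the radar data over the comparison period.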

Keywords: data assimilation, CODAR, HF radar, surface current, direct insertion

Procedia PDF Downloads 552