Search results for: big data ecosystem
7512 Regression Approach for Optimal Purchase of Hosts Cluster in Fixed Fund for Hadoop Big Data Platform
Authors: Haitao Yang, Jianming Lv, Fei Xu, Xintong Wang, Yilin Huang, Lanting Xia, Xuewu Zhu
Abstract:
Given a fixed fund, choosing between fewer hosts of higher capability and more hosts of lower capability is an unavoidable trade-off when building a Hadoop big data platform. An exploratory study is presented for a Housing Big Data Platform project (HBDP), where typical big data computing consists of SQL queries with aggregate, join, and space-time condition selections executed upon massive data from more than 10 million housing units. In HBDP, an empirical formula was introduced to predict the performance of candidate host clusters for the intended typical big data computing, and it was shaped via a regression approach. With this empirical formula, it is easy to suggest an optimal cluster configuration. The investigation was based on a typical Hadoop computing ecosystem, HDFS+Hive+Spark. A metric was proposed to measure the performance of Hadoop clusters in HBDP, and the measured performance was compared with its predicted counterpart on three kinds of typical SQL query tasks. Tests were conducted with respect to CPU benchmark, memory size, virtual host division, and the number of physical hosts in the cluster. The research has been applied to practical cluster procurement for housing big data computing.
Keywords: Hadoop platform planning, optimal cluster scheme at fixed-fund, performance empirical formula, typical SQL query tasks.
Downloads: 837
7511 Hybrid Methods for Optimisation of Weights in Spatial Multi-Criteria Evaluation Decision for Fire Risk and Hazard
Authors: I. Yakubu, D. Mireku-Gyimah, D. Asafo-Adjei
Abstract:
The challenge for everyone involved in preserving the ecosystem is to find creative ways to protect and restore the remaining ecosystems while accommodating and enhancing the country's social and economic well-being. Frequent fires of anthropogenic origin have been adversely affecting ecosystems in many countries. Hence, adopting decision-making approaches such as Multicriteria Decision Making (MCDM) is appropriate, since it enhances the evaluation and analysis of fire risk and hazard in the ecosystem. In this paper, fire risk and hazard data from the West Gonja area of Ghana were used with several methods (Analytic Hierarchy Process (AHP), Compromise Programming (CP), and Grey Relational Analysis (GRA)) for MCDM evaluation and analysis, to determine the optimal weighting method for fire risk and hazard. Land cover types were ranked using three criteria: Fire Hazard, Fire Fighting Capacity, and Response Risk. Pairwise comparison under AHP was used to determine the weights of the criteria. Weights for sub-criteria were also obtained by the pairwise comparison method. The results were optimised using GRA and CP. The results from each hybrid method, GRA and CP, were compared, and it was established that all methods were satisfactory in terms of weight optimisation. The best method for spatial multicriteria evaluation was the hybrid GRA method. Thus, a hybrid AHP and GRA method is a more effective method for ranking alternatives in MCDM than the hybrid AHP and CP method.
Keywords: Compromise programming, grey relational analysis, spatial multi-criteria, weight optimisation.
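As an illustration of the pairwise comparison step described in this abstract, the following is a minimal sketch of deriving criterion weights from an AHP comparison matrix. The abstract does not say how the priorities were computed, so the row geometric-mean approximation used here is an assumption, and the judgment values in `pairwise` are hypothetical, chosen only to make the example run; only the three criterion names come from the abstract.

```python
import math


def ahp_weights(matrix):
    """Approximate AHP priority weights with the row geometric-mean method.

    matrix[i][j] states how strongly criterion i is preferred over criterion j;
    a reciprocal matrix has matrix[j][i] == 1 / matrix[i][j].
    """
    n = len(matrix)
    # Geometric mean of each row, then normalise so the weights sum to 1.
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]


# Hypothetical judgments (not from the paper) for the three criteria:
# Fire Hazard, Fire Fighting Capacity, Response Risk.
pairwise = [
    [1.0, 3.0, 5.0],
    [1 / 3, 1.0, 2.0],
    [1 / 5, 1 / 2, 1.0],
]
weights = ahp_weights(pairwise)
```

For a perfectly consistent matrix the geometric-mean weights coincide with the principal-eigenvector weights; for inconsistent judgments the two can differ slightly.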
Downloads: 656
7510 Combining Fuzzy Logic and Neural Networks in Modeling Landfill Gas Production
Authors: Mohamed Abdallah, Mostafa Warith, Roberto Narbaitz, Emil Petriu, Kevin Kennedy
Abstract:
Heterogeneity of solid waste characteristics as well as the complex processes taking place within the landfill ecosystem motivated the implementation of soft computing methodologies such as artificial neural networks (ANN), fuzzy logic (FL), and their combination. The present work uses a hybrid ANN-FL model that employs knowledge-based FL to describe the process qualitatively and implements the learning algorithm of ANN to optimize model parameters. The model was developed to simulate and predict the landfill gas production at a given time based on operational parameters. The experimental data used were compiled from lab-scale experiment that involved various operating scenarios. The developed model was validated and statistically analyzed using F-test, linear regression between actual and predicted data, and mean squared error measures. Overall, the simulated landfill gas production rates demonstrated reasonable agreement with actual data. The discussion focused on the effect of the size of training datasets and number of training epochs.
Keywords: Adaptive neural fuzzy inference system (ANFIS), gas production, landfill
Downloads: 2412
7509 Prediction of Dissolved Oxygen in Rivers Using a Wang-Mendel Method – Case Study of Au Sable River
Authors: Mahmoud R. Shaghaghian
Abstract:
The amount of dissolved oxygen in a river has a direct effect on aquatic macroinvertebrates, which in turn indirectly influences the regional ecosystem. This paper attempts to predict dissolved oxygen in rivers using a simple fuzzy logic modeling approach, the Wang-Mendel method. The model uses only previous records to estimate upcoming values. For this purpose, daily and hourly records from eight stations in the Au Sable watershed in Michigan, United States, are used, covering periods of 12 years and 50 days respectively. Calculations indicate that for long-period prediction it is better to increase the input intervals, whereas for filling in missing data it is advisable to decrease the interval. Increasing the partitioning of input and output features has little influence on accuracy but makes the model too time-consuming. Increasing the number of input data has a similar effect to increasing the number of partitions. A large amount of training data does not substantially change accuracy, so an optimum training length should be selected.
Keywords: Dissolved oxygen, Au Sable, fuzzy logic modeling, Wang Mendel.
Downloads: 1890
7508 Surface Elevation Dynamics Assessment Using Digital Elevation Models, Light Detection and Ranging, GPS and Geospatial Information Science Analysis: Ecosystem Modelling Approach
Authors: Ali K. M. Al-Nasrawi, Uday A. Al-Hamdany, Sarah M. Hamylton, Brian G. Jones, Yasir M. Alyazichi
Abstract:
Surface elevation dynamics have always responded to disturbance regimes. Creating Digital Elevation Models (DEMs) to detect surface dynamics has led to the development of several methods, devices and data clouds. DEMs can provide accurate and quick results with cost efficiency, in comparison to the inherited geomatics survey techniques. Nowadays, remote sensing datasets have become a primary source to create DEMs, including LiDAR point clouds with GIS analytic tools. However, these data need to be tested for error detection and correction. This paper evaluates various DEMs from different data sources over time for Apple Orchard Island, a coastal site in southeastern Australia, in order to detect surface dynamics. Subsequently, 30 chosen locations were examined in the field to test the error of the DEMs surface detection using high resolution global positioning systems (GPSs). Results show significant surface elevation changes on Apple Orchard Island. Accretion occurred on most of the island while surface elevation loss due to erosion is limited to the northern and southern parts. Concurrently, the projected differential correction and validation method aimed to identify errors in the dataset. The resultant DEMs demonstrated a small error ratio (≤ 3%) from the gathered datasets when compared with the fieldwork survey using RTK-GPS. As modern modelling approaches need to become more effective and accurate, applying several tools to create different DEMs on a multi-temporal scale would allow easy predictions in time-cost-frames with more comprehensive coverage and greater accuracy. With a DEM technique for the eco-geomorphic context, such insights about the ecosystem dynamic detection, at such a coastal intertidal system, would be valuable to assess the accuracy of the predicted eco-geomorphic risk for the conservation management sustainability. 
Demonstrating this framework to evaluate the historical and current anthropogenic and environmental stressors on coastal surface elevation dynamism could be profitably applied worldwide.
Keywords: DEMs, eco-geomorphic-dynamic processes, geospatial information science, remote sensing, surface elevation changes.
Downloads: 1158
7507 Innovation Ecosystems in the Construction Industry
Authors: Cansu Gülser, Tuğce Ercan
Abstract:
The construction sector is a key driver of the global economy, contributing significantly to growth and employment through a diverse array of sub-sectors. However, it faces challenges due to its project-based nature, which often hampers long-term collaboration and broader incentives beyond individual projects. These limitations are frequently discussed in scientific literature as obstacles to innovation and industry-wide change. Traditional practices and unwritten rules further hinder the adoption of new processes within the construction industry. The disadvantages of the construction industry’s project-based structure in fostering innovation and long-term relationships include limited continuity, fragmented collaborations, and a focus on short-term goals, which collectively hinder the development of sustained partnerships, inhibit the sharing of knowledge and best practices, and reduce incentives for investing in innovative processes and technologies. This structure typically emphasizes specific projects, which restricts broader collaborations and incentives that extend beyond individual projects, thus impeding innovation and change. The temporal complexities inherent in project-based sectors like construction make it difficult to address societal challenges through collaborative efforts. Traditional management approaches are inadequate for scaling up innovations and adapting to significant changes. For systemic transformation in the construction sector, there is a need for more collaborative relationships and activities beyond traditional supply chains. This study delves into the concept of an innovation ecosystem within the construction sector, highlighting various research findings. It aims to explore key questions about the components that enhance innovation capacity, the relationship between a robust innovation ecosystem and this capacity, and the reasons why innovation is less prevalent and implemented in this sector compared to others. 
Additionally, it examines the main factors hindering innovation within companies and identifies strategies to improve these efforts, particularly in developing countries. The innovation ecosystem in the construction sector generates various outputs through interactions between business resources and external components. These outputs include innovative value creation, sustainable practices, robust collaborations, knowledge sharing, competitiveness, and advanced project management, all of which contribute significantly to company market performance and competitive advantage. This article offers insights and strategic recommendations for industry professionals, policymakers, and researchers interested in developing and sustaining innovation ecosystems in the construction sector. Future research should focus on broader samples for generalization, comparative sector analysis, and application-focused studies addressing real industry challenges. Additionally, studying the long-term impacts of innovation ecosystems, integrating advanced technologies like AI and machine learning into project management, and developing future application strategies and policies are also important.
Keywords: Construction industry, innovation ecosystem, innovation ecosystem components, project management.
Downloads: 91
7506 Observer Design for Ecological Monitoring
Authors: I. López, J. Garay, R. Carreño, Z. Varga
Abstract:
Monitoring of ecological systems is one of the major issues in ecosystem research. The concepts and methodology of mathematical systems theory provide useful tools to address this problem. In many cases, state monitoring of a complex ecological system consists of observing (measuring) certain state variables, and the whole state process has to be determined from the observed data. The solution proposed in the paper is the design of an observer system, which makes it possible to approximately recover the state process from its partial observation. The method is illustrated with a trophic chain of resource – producer – primary consumer type, and a numerical example is also presented.
Keywords: Monitoring, observer system, trophic chain
Downloads: 1431
7505 Insight-Based Evaluation of a Map-based Dashboard
Authors: Anna Fredriksson Häägg, Charlotte Weil, Niklas Rönnberg
Abstract:
Map-based dashboards are used for data exploration every day. The present study used an insight-based methodology for evaluating a map-based dashboard that presents research findings of water management and ecosystem services in the Amazon. In addition to analyzing the insights gained from using the dashboard, the evaluation method was compared to standardized questionnaires and task-based evaluations. The result suggests that the dashboard enabled the participants to gain domain-relevant, complex insights regarding the topic presented. Furthermore, the insight-based analysis highlighted unexpected insights and hypotheses regarding causes and potential adaptation strategies for remediation. Although time- and resource-consuming, the insight-based methodology was shown to have the potential of thoroughly analyzing how end users can utilize map-based dashboards for data exploration and decision making. Finally, the insight-based methodology is argued to evaluate tools in scenarios more similar to real-life usage, compared to task-based evaluation methods.
Keywords: Visual analytics, dashboard, insight-based evaluation, geographic visualization.
Downloads: 408
7504 Necessity of Risk Management of Various Industry-Associated Pollutants (Case Study of Gavkhoni Wetland Ecosystem)
Authors: Hekmatpanah, M.
Abstract:
Since the beginning of human history, human activities have caused many changes in the environment. Today, particular attention should be paid to gaining knowledge about the water quality of wetlands, which are pristine natural environments rich in genetic reserves. If the qualitative conditions of industrial areas (in terms of both physicochemical and biological conditions) are not addressed properly, they could disrupt natural ecosystems, especially rivers. With regard to the quality of water resources, determining pollutant sources plays a pivotal role in engineering projects as well as in designing water quality control systems. Thus, using different methods such as flow duration curves, a discharge-pollution load model, and frequency analysis with the HYFA software package, the risk of various industrial pollutants in the internationally and ecologically important Gavkhoni wetland is analyzed. In this study, a station located at Varzaneh City is used as the last station on the Zayanderud River, from where the river water is discharged into the wetland. Results showed that element concentrations often exceeded the allowed level and that the river water can endanger the regional ecosystem. In addition, if the river discharge is managed on a Q25 basis, concentrations of elements can be lowered, keeping them within the normal level.
Keywords: Pollutants Risk, Industry, Flow Discharge, Management, Gavkhoni Wetland
Downloads: 1245
7503 A Proposal for a Secure and Interoperable Data Framework for Energy Digitalization
Authors: Hebberly Ahatlan
Abstract:
The process of digitizing energy systems involves transforming traditional energy infrastructure into interconnected, data-driven systems that enhance efficiency, sustainability, and responsiveness. As smart grids become increasingly integral to the efficient distribution and management of electricity from both fossil and renewable energy sources, the energy industry faces strategic challenges associated with digitalization and interoperability — particularly in the context of modern energy business models, such as virtual power plants (VPPs). The critical challenge in modern smart grids is to seamlessly integrate diverse technologies and systems, including virtualization, grid computing and service-oriented architecture (SOA), across the entire energy ecosystem. Achieving this requires addressing issues like semantic interoperability, Information Technology (IT) and Operational Technology (OT) convergence, and digital asset scalability, all while ensuring security and risk management. This paper proposes a four-layer digitalization framework to tackle these challenges, encompassing persistent data protection, trusted key management, secure messaging, and authentication of IoT resources. Data assets generated through this framework enable AI systems to derive insights for improving smart grid operations, security, and revenue generation. Furthermore, this paper also proposes a Trusted Energy Interoperability Alliance as a universal guiding standard in the development of this digitalization framework to support more dynamic and interoperable energy markets.
Keywords: Digitalization, IT/OT convergence, semantic interoperability, TEIA alliance, VPP.
Downloads: 116
7502 A Real Time Development Study for Automated Centralized Remote Monitoring System at Royal Belum Forest
Authors: Amri Yusoff, Shahrizuan Shafiril, Ashardi Abas, Norma Che Yusoff
Abstract:
Illegal logging has been causing many effects, including flash floods, avalanches, and global warming. The purpose of this study was to help maintain the earth's ecosystem by protecting and regulating Malaysia's treasured rainforest, using a new technology that provides real-time alerts and lets the authorities respond faster to these illegal activities. The methodology of this research consisted of the design stages that were conducted, the system model and system architecture of the prototype, and the proposed hardware and software, mainly a microcontroller and sensors combined with an integrated GSM and GPS system. The prototype was deployed at Royal Belum forest in December 2014 for phase 1 and April 2015 for phase 2, at 21 pinpoint locations. The findings of this research were real-time captures of data such as temperature, humidity, gas, fire, and rain detection, which indicate the current natural state and habitat in the forest. In addition, the device's current location can be detected via GPS and then transmitted by SMS via the GSM system. All of its readings were sent in real time for further analysis. Comparison with data from the meteorological department showed that the precision of the device was about 95%, and these findings indicate that the system is acceptable and suitable for use in the field.
Keywords: Remote monitoring system, forest data, GSM, GPS, wireless sensor.
Downloads: 1621
7501 Surface Water Pollution by Open Refuse Dumpsite in North Central of Nigeria
Authors: Abimbola Motunrayo Folami, Ibironke Titilayo Enitan, Feroz Mohomed Swalaha
Abstract:
Water is a vital resource that is important in ensuring the growth and development of any country. To sustain basic human needs and the demands of agriculture, industry, conservation, and ecosystems, water of sufficient quality and quantity is needed. Contamination of water resources is now a global and public health concern. Hence, this study assessed the water quality of Ndawuse River by measuring the physicochemical parameters and heavy metal concentrations of the river using standard methods. In total, 16 surface water samples were obtained from five locations along the river, from upstream to downstream, as well as samples from the dumpsite. The results obtained were compared with the standard limits set by both the World Health Organization and the Federal Environmental Protection Agency for domestic purposes. The results of the measured parameters indicated that biological oxygen demand (85.88 mg/L), turbidity (44.51 NTU), iron (0.014 - 3.511 mg/L) and chromium (0.078 - 0.14 mg/L) were all above the standard limits. The results further showed that the quality of surface water is being significantly affected by human activities around the Ndawuse River, which could pose an adverse health risk to several communities that rely on this river as their primary source of water. Therefore, there is a need for strict enforcement of environmental laws to protect the aquatic ecosystem and to avoid the long-term cumulative exposure risk that heavy metals may pose to human health.
Keywords: Abuja, contaminants, heavy metals, Ndawuse River, Nigeria, surface water.
Downloads: 603
7500 Big Data: Big Challenges to Privacy and Data Protection
Authors: Abu Bakar Munir, Siti Hajar Mohd Yasin, Firdaus Muhammad-Sukki
Abstract:
This paper seeks to analyse the benefits of big data and, more importantly, the challenges it poses to privacy and data protection. First, the nature of big data is briefly discussed before presenting the potential of big data today. Afterwards, the issue of privacy and data protection is highlighted before discussing the challenges of implementing it in big data. In conclusion, the paper puts forward the debate on the adequacy of the existing legal framework in protecting personal data in the era of big data.
Keywords: Big data, data protection, information, privacy.
Downloads: 3924
7499 Designing an Editorialization Environment for Repeatable Self-Correcting Exercises
Authors: M. Kobylanski, D. Buskulic, P.-H. Duron, D. Revuz, F. Ruggieri, E. Sandier, C. Tijus
Abstract:
In order to design a cooperative e-learning platform, we observed teams of Teacher [T], Computer Scientist [CS] and exerciser's programmer-designer [ED] cooperating for the conception of a self-correcting exercise, but without the use of such a device in order to catch the kind of interactions a useful platform might provide. To do so, we first run a task analysis on how T, CS and ED should be cooperating in order to achieve, at best, the task of creating and implementing self-directed, self-paced, repeatable self-correcting exercises (RSE) in the context of open educational resources. The formalization of the whole process was based on the “objectives, activities and evaluations” theory of educational task analysis. Second, using the resulting frame as a “how-to-do it” guide, we run a series of three contrasted Hackathon of RSE-production to collect data about the cooperative process that could be later used to design the collaborative e-learning platform. Third, we used two complementary methods to collect, to code and to analyze the adequate survey data: the directional flow of interaction among T-CS-ED experts holding a functional role, and the Means-End Problem Solving analysis. Fourth, we listed the set of derived recommendations useful for the design of the exerciser as a cooperative e-learning platform. Final recommendations underline the necessity of building (i) an ecosystem that allows to sustain teams of T-CS-ED experts, (ii) a data safety platform although offering accessibility and open discussion about the production of exercises with their resources and (iii) a good architecture allowing the inheritance of parts of the coding of any exercise already in the data base as well as fast implementation of new kinds of exercises along with their associated learning activities.
Keywords: Distance open educational resources, pedagogical alignment, self-correcting exercises, teacher’s involvement, team roles.
Downloads: 517
7498 The Nematode Fauna Dynamics Peculiarities of Highlands Different Ecosystems (Eastern Georgia)
Authors: E. Tskitishvili, I. Eliava, T. Tskitishvili, N. Bagathuria, L. Zghenti, M. Gigolashvili
Abstract:
The dynamics of nematode fauna abundance were studied in various ecosystems of the Gombori Mountain Ridge at the peak of the fauna dynamics. The nature of the dynamics is broadly similar in all six biotypes; the difference is evident only in the total number of nematodes.
Keywords: Nematoda, dynamic, highland, ecosystem
Downloads: 1339
7497 Coastal Resources Spatial Planning and Potential Oil Risk Analysis: Case Study of Misratah’s Coastal Resources, Libya
Authors: Abduladim Maitieg, Kevin Lynch, Mark Johnson
Abstract:
The goal of the Libyan Environmental General Authority (EGA) and National Oil Corporation (Department of Health, Safety & Environment) during the last 5 years has been to adopt a common approach to coastal and marine spatial planning. Protection and planning of the coastal zone is significant for Libya, due to the length of its coast, the high rate of oil export, and the potential negative impacts of spills on coastal and marine habitats. Coastal resource scenarios constitute an important tool for exploring the long-term and short-term consequences of oil spill impact and the available response options that would provide an integrated perspective on mitigation. To investigate that, this paper reviews the Misratah coastal parameters to present the physical and human controls and attributes of coastal habitats as the first step in understanding how they may be damaged by an oil spill. This paper also investigates coastal resources, providing a better understanding of the resources and factors that impact the integrity of the ecosystem. Therefore, the study described the potential spatial distribution of oil spill risk and the coastal resources value, and also created spatial maps of coastal resources and their vulnerability to oil spills along the coast. This study proposes an analysis of coastal resources condition at a local level in the Misratah region of the Mediterranean Sea, considering the implementation of coastal and marine spatial planning over time as an indication of the will to manage urban development. Analysis of oil spill contamination and its impact on coastal resources depends on (1) oil spill sequence, (2) oil spill location, and (3) oil spill movement near the coastal area. The resulting maps show natural resources, socio-economic activity, and environmental resources along the coast, as well as oil spill locations. Moreover, the study provides significant geodatabase information which is required for coastal sensitivity index mapping and coastal management studies. 
The outcome of study provides the information necessary to set an Environmental Sensitivity Index (ESI) for the Misratah shoreline, which can be used for management of coastal resources and setting boundaries for each coastal sensitivity sectors, as well as to help planners measure the impact of oil spills on coastal resources. Geographic Information System (GIS) tools were used in order to store and illustrate the spatial convergence of existing socio-economic activities such as fishing, tourism, and the salt industry, and ecosystem components such as sea turtle nesting area, Sabkha habitats, and migratory birds feeding sites. These geodatabases help planners investigate the vulnerability of coastal resources to an oil spill.
Keywords: Coastal and marine spatial planning advancement training, GIS mapping, human uses, ecosystem components, Misratah coast, Libyan, oil spill.
Downloads: 959
7496 Data Preprocessing for Supervised Leaning
Authors: S. B. Kotsiantis, D. Kanellopoulos, P. E. Pintelas
Abstract:
Many factors affect the success of Machine Learning (ML) on a given task. The representation and quality of the instance data is first and foremost. If there is much irrelevant and redundant information present, or noisy and unreliable data, then knowledge discovery during the training phase is more difficult. It is well known that data preparation and filtering steps take a considerable amount of processing time in ML problems. Data pre-processing includes data cleaning, normalization, transformation, feature extraction and selection, etc. The product of data pre-processing is the final training set. It would be nice if a single sequence of data pre-processing algorithms had the best performance for each data set, but this is not the case. Thus, we present the best-known algorithms for each step of data pre-processing so that one can achieve the best performance for a given data set.
Keywords: Data mining, feature selection, data cleaning.
Downloads: 6089
7495 Applications of Big Data in Education
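To make the normalization step named in this abstract concrete, here is a minimal sketch of two widely used scalings, min-max and z-score. The abstract does not say which normalization algorithms the paper covers, so both choices are assumptions, as are the function names and the handling of constant columns.

```python
import statistics


def min_max_scale(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def z_score(values):
    """Standardise values to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


scaled = min_max_scale([2, 4, 6])
standardised = z_score([1, 3])
```

In a training pipeline the minimum, maximum, mean, and standard deviation would normally be computed on the training set only and then reused for unseen data.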
Authors: Faisal Kalota
Abstract:
Big Data and analytics have gained huge momentum in recent years. Big Data feeds into the field of Learning Analytics (LA), which may allow academic institutions to better understand learners’ needs and proactively address them. Hence, it is important to have an understanding of Big Data and its applications. The purpose of this descriptive paper is to provide an overview of Big Data, the technologies used in Big Data, and some of the applications of Big Data in education. Additionally, it discusses some of the concerns related to Big Data and current research trends. While Big Data can provide big benefits, it is important that institutions understand their own needs, infrastructure, resources, and limitations before jumping on the Big Data bandwagon.
Keywords: Analytics, Big Data in Education, Hadoop, Learning Analytics.
Downloads: 4873
7494 Research of Data Cleaning Methods Based on Dependency Rules
Authors: Yang Bao, Shi Wei Deng, Wang Qun Lin
Abstract:
This paper introduces the concept and principle of data cleaning, analyzes the types and causes of dirty data, and proposes several key steps of a typical cleaning process. It puts forward a scalable and versatile data cleaning framework. For data with attribute dependency relations, it designs several violation-data discovery algorithms, expressed as formal formulas, which can find data inconsistent with conditional attribute dependencies across all target columns, whether the data is structured (SQL) or unstructured (NoSQL). It also gives 6 data cleaning methods based on these algorithms.
Keywords: Data cleaning, dependency rules, violation data discovery, data repair.
Downloads: 2611
7493 Coalescing Data Marts
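As a rough illustration of what discovering data that violates an attribute dependency can look like, the sketch below checks a plain functional dependency `lhs -> rhs` over a list of records. It is not the paper's algorithm: the abstract gives no formulas, so the function name, the record layout, and the example `zip -> city` rule are all hypothetical.

```python
from collections import defaultdict


def find_fd_violations(rows, lhs, rhs):
    """Return the lhs value combinations that map to more than one rhs value.

    rows: list of dicts (records); lhs: list of determining attribute names;
    rhs: name of the dependent attribute.
    """
    groups = defaultdict(set)
    for row in rows:
        key = tuple(row[attr] for attr in lhs)
        groups[key].add(row[rhs])
    # A dependency lhs -> rhs holds only if every key has exactly one rhs value.
    return {key: values for key, values in groups.items() if len(values) > 1}


# Hypothetical records: the rule "zip determines city" is violated for zip 10001.
records = [
    {"zip": "10001", "city": "New York"},
    {"zip": "10001", "city": "Newark"},
    {"zip": "60601", "city": "Chicago"},
]
violations = find_fd_violations(records, ["zip"], "city")
```

Because the records are plain dictionaries, the same check applies whether they come from a SQL result set or from documents in a NoSQL store.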
Authors: N. Parimala, P. Pahwa
Abstract:
OLAP uses multidimensional structures, to provide access to data for analysis. Traditionally, OLAP operations are more focused on retrieving data from a single data mart. An exception is the drill across operator. This, however, is restricted to retrieving facts on common dimensions of the multiple data marts. Our concern is to define further operations while retrieving data from multiple data marts. Towards this, we have defined six operations which coalesce data marts. While doing so we consider the common as well as the non-common dimensions of the data marts.Keywords: Data warehouse, Dimension, OLAP, Star Schema.
Downloads 1558
7492 Mining Big Data in Telecommunications Industry: Challenges, Techniques, and Revenue Opportunity
Authors: Hoda A. Abdel Hafez
Abstract:
Mining big data represents a big challenge nowadays. Much research is concerned with mining massive amounts of data and big data streams. Mining big data faces many challenges, including scalability, speed, heterogeneity, accuracy, provenance, and privacy. In the telecommunication industry, mining big data is like mining for gold; it represents a big opportunity for maximizing the revenue streams in this industry. This paper discusses the characteristics of big data (volume, variety, velocity, and veracity), data mining techniques and tools for handling very large data sets, mining big data in telecommunication, and the benefits and opportunities gained from it.
Keywords: Mining Big Data, Big Data, Machine learning, Data Streams, Telecommunication.
Downloads 2480
7491 A Post Keynesian Environmental Macroeconomic Model for Agricultural Water Sustainability under Climate Change in the Murray-Darling Basin, Australia
Authors: Ke Zhao, Ballarat Colin Richardson, Jerry Courvisanos, John Crawford
Abstract:
Climate change has profound consequences for the agriculture of south-eastern Australia and its climate-induced water shortage in the Murray-Darling Basin. Post Keynesian Economics (PKE) macro-dynamics, along with Kaleckian investment and growth theory, are used to develop an ecological-economic system dynamics model of this complex nonlinear river basin system. The Murray-Darling Basin Simulation Model (MDB-SM) uses the principles of PKE to incorporate the fundamental uncertainty of economic behaviors of farmers regarding the investments they make and the climate change they face, particularly as regards water ecosystem services. MDB-SM provides a framework for macroeconomic policies, especially for long-term fiscal policy and for policy directed at the sustainability of agricultural water, as measured by socio-economic well-being considerations, which include sustainable consumption and investment in the river basin. The model can also reproduce other ecological and economic aspects and, for certain parameters and initial values, exhibit endogenous business cycles and ecological sustainability with realistic characteristics. Most importantly, MDB-SM provides a platform for the analysis of alternative economic policy scenarios. These results reveal the importance of understanding water ecosystem adaptation under climate change by integrating a PKE macroeconomic analytical framework with the system dynamics modelling approach. Once parameterised and supplied with historical initial values, MDB-SM should prove to be a practical tool to provide alternative long-term policy simulations of agricultural water and socio-economic well-being.
Keywords: Agricultural water, Macroeconomic dynamics, Modeling, Investment dynamics, Sustainability, Unemployment, Economics, Keynesian, Kaleckian.
Downloads 2172
7490 Advancing the Hi-Tech Ecosystem in the Periphery: The Case of the Sea of Galilee Region
Authors: Yael Dubinsky, Orit Hazzan
Abstract:
There is a constant need for hi-tech innovation to be decentralized to peripheral regions. This work describes how we applied Design Science Research (DSR) principles to define what we refer to as the Sea of Galilee (SoG) method. The goal of the SoG method is to harness existing and new technological initiatives in peripheral regions to create a socio-technological network that can initiate and maintain hi-tech activities. The SoG method consists of a set of principles, a stakeholder network, and actual hi-tech business initiatives, including their infrastructure and practices. The three cycles of DSR, the Relevance, Design, and Rigor cycles, lay out a research framework to sharpen the requirements, collect data from case studies, and iteratively refine the SoG method based on the existing knowledge base. We propose that the SoG method can be deployed by regional authorities that wish to be considered as smart regions (an extension of the notion of smart cities).
Keywords: Design Science Research, socio-technological initiatives, Sea of Galilee method, periphery stakeholder network, hi-tech initiatives.
Downloads 321
7489 Comparative Analysis of Diverse Collection of Big Data Analytics Tools
Authors: S. Vidhya, S. Sarumathi, N. Shanthi
Abstract:
Over the past era, many efforts and studies have been carried out to develop proficient tools for performing various tasks in big data. Recently, big data has received a lot of publicity, and for good reasons. Because the datasets are large and complex, they are difficult to process with traditional data processing applications. This concern makes the development of various big data tools all the more necessary. Moreover, the main aim of big data analytics is to apply advanced analytic techniques to very large, diverse datasets that range in size from terabytes to zettabytes and in type from structured to unstructured and from batch to streaming. Big data is useful for data sets whose size or type is beyond the capability of traditional relational databases to capture, manage, and process the data with low latency. The resulting challenges have led to the emergence of powerful big data tools. In this survey, a diverse collection of big data tools is illustrated and compared in terms of their salient features.
Keywords: Big data, Big data analytics, Business analytics, Data analysis, Data visualization, Data discovery.
Downloads 3775
7488 Native Plants Marketing by Entrepreneurs in the Landscaping Industry in Japan
Authors: Yuki Hara
Abstract:
Entrepreneurs are welcomed to the landscaping industry for conserving biological diversity, practically and theoretically, in landscaping construction, although there are limited reports on corporate trials of creating a market with a new logistics system for native plants (NP) between landscaping companies and nurserymen. This paper explores the entrepreneurial process of a landscaping company, “5byMidori”, for NP marketing. This paper employs a case study design. Data are collected from interviews with the manager and designer of 5byMidori, 2 scientists, 1 organization, and 18 nurserymen; fieldwork at two nurseries; observations of marketing activities over three years; and texts from published documents about the business concept and marketing strategy with NP. These data are analyzed by qualitative methods. The results show that NP is suitable for the vision of 5byMidori of improving the desertified urban environment through closer urban-rural linkage. A professional landscaping team changes a forestry organization into NP producers conserving a large nursery on a mountain. Multifaceted PR based on the entrepreneurial context and personal background of a landscaping venture can foster team members' businesses and help customers and users to understand the biodiversity value of the product. Wider partnerships with existing nurserymen at other sites in many regions need socio-economic incentives and environmental reliability. In conclusion, the entrepreneurial marketing of a landscaping company needs to add more meanings and a variety of merits in terms of ecosystem services, as NP tends to be defined academically and to be independent of cultures such as those of nurserymen and forestry.
Keywords: Biological diversity, landscaping industry, marketing, native plants.
Downloads 719
7487 Multi-labeled Data Expressed by a Set of Labels
Authors: Tetsuya Furukawa, Masahiro Kuzunishi
Abstract:
Collected data must be organized to be utilized efficiently, and hierarchical classification is an efficient approach to organizing data. When data is classified into multiple categories or annotated with a set of labels, users request multi-labeled data by giving a set of labels. There are several interpretations of the data expressed by a set of labels. This paper discusses which data is expressed by a set of labels by introducing orders for sets of labels, and shows that there are four types of orders, which are characterized by whether the labels of the expressed data include every label of the given set of labels within the range of the set. Two desirable properties of the orders are discussed: that data is also expressed by the higher set of labels, and that different sets of labels express different data.
Keywords: Classification Hierarchies, Multi-labeled Data, Multiple Classification, Orders of Sets of Labels
Downloads 1303
7486 The Effect of Multiple Environmental Conditions on Acacia Senegal Seedling’s Carbon, Nitrogen, and Hydrogen Contents: An Experimental Investigation
Authors: Abdoelmoniem A. Attaelmanan, Ahmed A. H. Siddig
Abstract:
This study was conducted in light of continual global climate changes that are projected to increase aridity, alter soil fertility, and increase pollution. Plant growth and development largely depend on the combination of available water and nutrients in the soil. Changes in the climate and atmospheric chemistry can have serious effects on these growth factors. Plant carbon (C), nitrogen (N), and hydrogen (H) play a fundamental role in the maintenance of ecosystem structure and function. Hashab (Acacia senegal), which produces gum Arabic, supports dryland ecosystems in tropical zones through its potential to restore degraded soils; hence, it is ecologically and economically important for the dry areas of sub-Saharan Africa. The study aims at investigating the effects of water stress (simulated drought) and poor soil type on Acacia senegal C, N, and H contents. Seven-day-old seedlings were assigned to the treatments in a split-plot design for four weeks. The main plot is irrigation interval (well-watered and water-stressed), and the subplot is soil type (silt and sandy soils). Seedling C%, N%, and H% were measured using a CHNS-O Analyzer and applying the Standard Test Method. Irrigation interval and soil type had no effect on whole-seedling and leaf C%, N%, and H%; irrigation interval affected stem C% and H%; both irrigation interval and soil type affected root N%; and an interaction effect of water and soil was found on leaf and root N%. Well-watered irrigation combined with soil that is rich in N and other nutrients would result in the greatest seedling C, N, and H content, which will enhance growth and biomass accumulation and can play a crucial role in ecosystem productivity and services in dryland regions.
Keywords: Acacia senegal, Africa, climate change, drylands, nutrients biomass, Sub-Sahara, Sudan.
Downloads 533
7485 Optimizing Usability Testing with Collaborative Method in an E-Commerce Ecosystem
Authors: Markandeya Kunchi
Abstract:
Usability testing (UT) is one of the vital steps in the user-centred design (UCD) process when designing a product. In an e-commerce ecosystem, UT becomes a priority because new products, features, and services are launched very frequently, and the company incurs losses if an unusable and inefficient product is put on the market and rejected by customers. This paper tries to answer why UT is important in the product life-cycle of an e-commerce ecosystem. Secondary user research was conducted to find out the work patterns, development methods, types of stakeholders, and technology constraints of a typical e-commerce company. Qualitative user interviews were conducted with product managers and designers to find out the structure, project planning, product management method, and role of the design team in a mid-level company. The paper tries to address the usual apprehensions companies have about adopting UT within the team. It identifies factors such as limited monetary resources, lack of a usability expert, narrow timelines, and lack of understanding among higher management as some of the primary reasons. Outsourcing UT to vendors is also very prevalent among mid-level e-commerce companies, but it has its own severe repercussions, such as very little team involvement, huge cost, misinterpretation of the findings, elongated timelines, and lack of empathy towards the customer. The consequences of having no UT process within the team, or of conducting UT through vendors, are bad user experiences for customers interacting with the product and badly designed products that are neither useful nor utilitarian. As a result, companies see dipping conversion rates in apps and websites, huge bounce rates, and increased uninstall rates. Thus, there was a need for a leaner UT system that could solve these issues for the company. This paper highlights optimizing the UT process with a collaborative method.
The degree of optimization and the structure of the collaborative method are the highlights of this paper. The collaborative method of UT is one in which the centralised design team of the company takes responsibility for conducting and analysing the UT. The UT is usually of a formative kind, where designers take the findings into account and use them in the ideation process. The success of the collaborative method of UT is due to its ability to sync with the product management method employed by the company or team. The collaborative method focuses on engaging various teams (design, marketing, product, administration, IT, etc.), each with its own defined roles and responsibilities, in conducting a smooth UT with users in-house. The paper finally highlights the positive results of the collaborative UT method after conducting more than 100 in-lab interviews with users across different lines of business. Some of these are improved interaction between stakeholders and the design team, empathy towards users, improved design iteration, better sanity checks of design solutions, optimization of time and money, and effective and efficient design solutions. The future scope of collaborative UT is to make this method leaner by reducing the number of days needed to complete the entire project, from planning between teams to publishing the UT report.
Keywords: Usability testing, collaborative method, e-commerce, product management method.
Downloads 667
7484 The Comparison of Data Replication in Distributed Systems
Authors: Iman Zangeneh, Mostafa Moradi, Ali Mokhtarbaf
Abstract:
The necessity of the ever-increasing use of distributed data in computer networks is obvious to all. One technique that is applied to distributed data to increase efficiency and reliability is data replication. In this paper, after introducing this technique and its advantages, we examine some dynamic data replication strategies. We examine their characteristics for some scenarios and then propose some suggestions for their improvement.
Keywords: data replication, data hiding, consistency, dynamic data replication strategy
Downloads 1634
7483 Hyperspectral Mapping Methods for Differentiating Mangrove Species along Karachi Coast
Authors: Sher Muhammad, Mirza Muhammad Waqar
Abstract:
It is necessary to monitor and identify mangrove types and their spatial extent near coastal areas because they play an important role in the coastal ecosystem and in environmental protection. This research aims at identifying and mapping mangrove types along the Karachi coast, ranging from 24.79° to 24.85° in latitude and 66.91° to 66.97° in longitude, using hyperspectral remote sensing data and techniques. An image acquired in February 2012 by the Hyperion sensor was used for this research. Image pre-processing includes geometric and radiometric correction followed by Minimum Noise Fraction (MNF) and Pixel Purity Index (PPI). The output of MNF and PPI was analyzed by visualizing it in n dimensions for end member extraction. Well-distributed clusters on the n-dimensional scatter plot were selected with the region of interest (ROI) tool as end members. These end members were used as input for the classification techniques applied to identify and map mangrove species, including Spectral Angle Mapper (SAM), Spectral Feature Fitting (SFF), and Spectral Information Divergence (SID). Only two types of mangroves, namely Avicennia marina (White Mangroves) and Avicennia germinans (Black Mangroves), were observed throughout the study area.
Keywords: Mangrove, Hyperspectral, SAM, SFF, SID.
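Of the classifiers named in the abstract, Spectral Angle Mapper is the simplest to illustrate: a pixel is assigned to the end member whose spectrum makes the smallest angle with it, where the angle is arccos(p·r / (|p||r|)). The sketch below is a minimal illustration under stated assumptions, not the authors' processing chain; the function names, the three-band spectra, and the class labels are hypothetical.

```python
import math


def spectral_angle(pixel, reference):
    """Angle in radians between two spectra treated as vectors."""
    dot = sum(p * r for p, r in zip(pixel, reference))
    norm_p = math.sqrt(sum(p * p for p in pixel))
    norm_r = math.sqrt(sum(r * r for r in reference))
    # Clamp to [-1, 1] so rounding error cannot push acos out of range.
    cos_theta = max(-1.0, min(1.0, dot / (norm_p * norm_r)))
    return math.acos(cos_theta)


def classify(pixel, endmembers):
    """Return the name of the end member with the smallest spectral angle."""
    return min(endmembers, key=lambda name: spectral_angle(pixel, endmembers[name]))


# Hypothetical three-band end member spectra (real Hyperion data has
# far more bands); the values are made up for illustration.
endmembers = {
    "class_a": [0.1, 0.4, 0.5],
    "class_b": [0.5, 0.4, 0.1],
}
label = classify([0.2, 0.8, 1.0], endmembers)
```

Because the angle ignores vector length, the example pixel, which is a brighter copy of the `class_a` spectrum, is still assigned to `class_a`; this insensitivity to overall illumination is the usual reason given for using SAM.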
Downloads 2906