Search results for: big data ecosystem
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 25752

Search results for: big data ecosystem

24552 Estimating Tree Height and Forest Classification from Multi Temporal Risat-1 HH and HV Polarized Satellite Aperture Radar Interferometric Phase Data

Authors: Saurav Kumar Suman, P. Karthigayani

Abstract:

In this paper the height of the tree is estimated and forest types is classified from the multi temporal RISAT-1 Horizontal-Horizontal (HH) and Horizontal-Vertical (HV) Polarised Satellite Aperture Radar (SAR) data. The novelty of the proposed project is combined use of the Back-scattering Coefficients (Sigma Naught) and the Coherence. It uses Water Cloud Model (WCM). The approaches use two main steps. (a) Extraction of the different forest parameter data from the Product.xml, BAND-META file and from Grid-xxx.txt file come with the HH & HV polarized data from the ISRO (Indian Space Research Centre). These file contains the required parameter during height estimation. (b) Calculation of the Vegetation and Ground Backscattering, Coherence and other Forest Parameters. (c) Classification of Forest Types using the ENVI 5.0 Tool and ROI (Region of Interest) calculation.

Keywords: RISAT-1, classification, forest, SAR data

Procedia PDF Downloads 404
24551 Presenting a Model for Predicting the State of Being Accident-Prone of Passages According to Neural Network and Spatial Data Analysis

Authors: Hamd Rezaeifar, Hamid Reza Sahriari

Abstract:

Accidents are considered to be one of the challenges of modern life. Due to the fact that the victims of this problem and also internal transportations are getting increased day by day in Iran, studying effective factors of accidents and identifying suitable models and parameters about this issue are absolutely essential. The main purpose of this research has been studying the factors and spatial data affecting accidents of Mashhad during 2007- 2008. In this paper it has been attempted to – through matching spatial layers on each other and finally by elaborating them with the place of accident – at the first step by adding landmarks of the accident and through adding especial fields regarding the existence or non-existence of effective phenomenon on accident, existing information banks of the accidents be completed and in the next step by means of data mining tools and analyzing by neural network, the relationship between these data be evaluated and a logical model be designed for predicting accident-prone spots with minimum error. The model of this article has a very accurate prediction in low-accident spots; yet it has more errors in accident-prone regions due to lack of primary data.

Keywords: accident, data mining, neural network, GIS

Procedia PDF Downloads 45
24550 Methodology of the Turkey’s National Geographic Information System Integration Project

Authors: Buse A. Ataç, Doğan K. Cenan, Arda Çetinkaya, Naz D. Şahin, Köksal Sanlı, Zeynep Koç, Akın Kısa

Abstract:

With its spatial data reliability, interpretation and questioning capabilities, Geographical Information Systems make significant contributions to scientists, planners and practitioners. Geographic information systems have received great attention in today's digital world, growing rapidly, and increasing the efficiency of use. Access to and use of current and accurate geographical data, which are the most important components of the Geographical Information System, has become a necessity rather than a need for sustainable and economic development. This project aims to enable sharing of data collected by public institutions and organizations on a web-based platform. Within the scope of the project, INSPIRE (Infrastructure for Spatial Information in the European Community) data specifications are considered as a road-map. In this context, Turkey's National Geographic Information System (TUCBS) Integration Project supports sharing spatial data within 61 pilot public institutions as complied with defined national standards. In this paper, which is prepared by the project team members in the TUCBS Integration Project, the technical process with a detailed methodology is explained. In this context, the main technical processes of the Project consist of Geographic Data Analysis, Geographic Data Harmonization (Standardization), Web Service Creation (WMS, WFS) and Metadata Creation-Publication. In this paper, the integration process carried out to provide the data produced by 61 institutions to be shared from the National Geographic Data Portal (GEOPORTAL), have been trying to be conveyed with a detailed methodology.

Keywords: data specification, geoportal, GIS, INSPIRE, Turkish National Geographic Information System, TUCBS, Turkey's national geographic information system

Procedia PDF Downloads 142
24549 Secure Content Centric Network

Authors: Syed Umair Aziz, Muhammad Faheem, Sameer Hussain, Faraz Idris

Abstract:

Content centric network is the network based on the mechanism of sending and receiving the data based on the interest and data request to the specified node (which has cached data). In this network, the security is bind with the content not with the host hence making it host independent and secure. In this network security is applied by taking content’s MAC (message authentication code) and encrypting it with the public key of the receiver. On the receiver end, the message is first verified and after verification message is saved and decrypted using the receiver's private key.

Keywords: content centric network, client-server, host security threats, message authentication code, named data network, network caching, peer-to-peer

Procedia PDF Downloads 642
24548 Fuel Inventory/ Depletion Analysis for a Thorium-Uranium Dioxide (Th-U) O2 Pin Cell Benchmark Using Monte Carlo and Deterministic Codes with New Version VIII.0 of the Evaluated Nuclear Data File (ENDF/B) Nuclear Data Library

Authors: Jamal Al-Zain, O. El Hajjaji, T. El Bardouni

Abstract:

A (Th-U) O2 fuel pin benchmark made up of 25 w/o U and 75 w/o Th was used. In order to analyze the depletion and inventory of the fuel for the pressurized water reactor pin-cell model. The new version VIII.0 of the ENDF/B nuclear data library was used to create a data set in ACE format at various temperatures and process the data using the MAKXSF6.2 and NJOY2016 programs to process the data at the various temperatures in order to conduct this study and analyze cross-section data. The infinite multiplication factor, the concentrations and activities of the main fission products, the actinide radionuclides accumulated in the pin cell, and the total radioactivity were all estimated and compared in this study using the Monte Carlo N-Particle 6 (MCNP6.2) and DRAGON5 programs. Additionally, the behavior of the Pressurized Water Reactor (PWR) thorium pin cell that is dependent on burn-up (BU) was validated and compared with the reference data obtained using the Massachusetts Institute of Technology (MIT-MOCUP), Idaho National Engineering and Environmental Laboratory (INEEL-MOCUP), and CASMO-4 codes. The results of this study indicate that all of the codes examined have good agreements.

Keywords: PWR thorium pin cell, ENDF/B-VIII.0, MAKXSF6.2, NJOY2016, MCNP6.2, DRAGON5, fuel burn-up.

Procedia PDF Downloads 99
24547 Natural Language News Generation from Big Data

Authors: Bastian Haarmann, Likas Sikorski

Abstract:

In this paper, we introduce an NLG application for the automatic creation of ready-to-publish texts from big data. The fully automatic generated stories have a high resemblance to the style in which the human writer would draw up a news story. Topics may include soccer games, stock exchange market reports, weather forecasts and many more. The generation of the texts runs according to the human language production. Each generated text is unique. Ready-to-publish stories written by a computer application can help humans to quickly grasp the outcomes of big data analyses, save time-consuming pre-formulations for journalists and cater to rather small audiences by offering stories that would otherwise not exist.

Keywords: big data, natural language generation, publishing, robotic journalism

Procedia PDF Downloads 430
24546 Performance Evaluation of the Classic seq2seq Model versus a Proposed Semi-supervised Long Short-Term Memory Autoencoder for Time Series Data Forecasting

Authors: Aswathi Thrivikraman, S. Advaith

Abstract:

The study is aimed at designing encoders for deciphering intricacies in time series data by redescribing the dynamics operating on a lower-dimensional manifold. A semi-supervised LSTM autoencoder is devised and investigated to see if the latent representation of the time series data can better forecast the data. End-to-end training of the LSTM autoencoder, together with another LSTM network that is connected to the latent space, forces the hidden states of the encoder to represent the most meaningful latent variables relevant for forecasting. Furthermore, the study compares the predictions with those of a traditional seq2seq model.

Keywords: LSTM, autoencoder, forecasting, seq2seq model

Procedia PDF Downloads 152
24545 The Analysis of Emergency Shutdown Valves Torque Data in Terms of Its Use as a Health Indicator for System Prognostics

Authors: Ewa M. Laskowska, Jorn Vatn

Abstract:

Industry 4.0 focuses on digital optimization of industrial processes. The idea is to use extracted data in order to build a decision support model enabling use of those data for real time decision making. In terms of predictive maintenance, the desired decision support tool would be a model enabling prognostics of system's health based on the current condition of considered equipment. Within area of system prognostics and health management, a commonly used health indicator is Remaining Useful Lifetime (RUL) of a system. Because the RUL is a random variable, it has to be estimated based on available health indicators. Health indicators can be of different types and come from different sources. They can be process variables, equipment performance variables, data related to number of experienced failures, etc. The aim of this study is the analysis of performance variables of emergency shutdown valves (ESV) used in oil and gas industry. ESV is inspected periodically, and at each inspection torque and time of valve operation are registered. The data will be analyzed by means of machine learning or statistical analysis. The purpose is to investigate whether the available data could be used as a health indicator for a prognostic purpose. The second objective is to examine what is the most efficient way to incorporate the data into predictive model. The idea is to check whether the data can be applied in form of explanatory variables in Markov process or whether other stochastic processes would be a more convenient to build an RUL model based on the information coming from registered data.

Keywords: emergency shutdown valves, health indicator, prognostics, remaining useful lifetime, RUL

Procedia PDF Downloads 90
24544 Achieving Sustainable Lifestyles Based on the Spiritual Teaching and Values of Buddhism from Lumbini, Nepal

Authors: Purna Prasad Acharya, Madhav Karki, Sunta B. Tamang, Uttam Basnet, Chhatra Katwal

Abstract:

The paper outlines the idea behind achieving sustainable lifestyles based on the spiritual values and teachings of Lord Buddha. This objective is to be achieved by spreading the tenets and teachings of Buddhism throughout the Asia Pacific region and the world from the sacred birth place of Buddha - Lumbini, Nepal. There is an urgent need to advance the relevance of Buddhist philosophy in tackling the triple planetary crisis of climate change, nature’s decline, and pollution. Today, the world is facing an existential crisis due to the above crises, exasperated by hunger, poverty and armed conflict. To address multi-dimensional impacts, the global communities have to adopt simple life styles that respect nature and universal human values. These were the basic teachings of Gautam Buddha. Lumbini, Nepal has the moral obligation to widely disseminate Buddha’s teaching to the world and receive constant feedback and learning to develop human and ecosystem resilience by molding the lifestyles of current and future generations through adaptive learning and simplicity across the geography and nationality based on spirituality and environmental stewardship. By promoting Buddhism, Nepal has developed a pro-nature tourism industry that focuses on both its spiritual and bio-cultural heritage. Nepal is a country rich in ancient wisdom, where sages have sought knowledge, practiced meditation, and followed spiritual paths for thousands of years. It can spread the teachings of Buddha in a way people can search for and adopt ways to live, creating harmony with nature. Using tools of natural sciences and social sciences, the team will package knowledge and share the idea of community well-being within the framework of environmental sustainability, social harmony and universal respect for nature and people in a more holistic manner. This notion takes into account key elements of sustainable development such as food-energy-water-biodiversity interconnections, environmental conservation, ecological integrity, ecosystem health, community resiliency, adaptation capacity, and indigenous culture, knowledge and values. This inclusive concept has garnered a strong network of supporters locally, regionally, and internationally. The key objectives behind this concept are: a) to leverage expertise and passion of a network of global collaborators to advance research, education, and policy outreach in the areas of human sustainability based on lifestyle change using the power of spirituality and Buddha’s teaching, resilient lifestyles, and adaptive living; b) help develop creative short courses for multi-disciplinary teaching in educational institutions worldwide in collaboration with Lumbini Buddha University and other relevant partners in Nepal; c) help build local and regional intellectual and cultural teaching and learning capacity by improving professional collaborations to promote nature based and Buddhist value-based lifestyles by connecting Lumbini to Nepal’s rich nature; d) promote research avenues to provide policy relevant knowledge that is creative, innovative, as well as practical and locally viable; and e) connect local research and outreach work with academic and cultural partners in South Korea so as to open up Lumbini based Buddhist heritage and Nepal’s Karnali River basin’s unique natural landscape to Korean scholars and students to promote sustainable lifestyles leading to human living in harmony with nature.

Keywords: triple planetary crisis, spirituality, sustainable lifestyles, living in harmony with nature, resilience

Procedia PDF Downloads 32
24543 Block Mining: Block Chain Enabled Process Mining Database

Authors: James Newman

Abstract:

Process mining is an emerging technology that looks to serialize enterprise data in time series data. It has been used by many companies and has been the subject of a variety of research papers. However, the majority of current efforts have looked at how to best create process mining from standard relational databases. This paper is the first pass at outlining a database custom-built for the minimal viable product of process mining. We present Block Miner, a blockchain protocol to store process mining data across a distributed network. We demonstrate the feasibility of storing process mining data on the blockchain. We present a proof of concept and show how the intersection of these two technologies helps to solve a variety of issues, including but not limited to ransomware attacks, tax documentation, and conflict resolution.

Keywords: blockchain, process mining, memory optimization, protocol

Procedia PDF Downloads 101
24542 Vulnerability of Groundwater to Pollution in Akwa Ibom State, Southern Nigeria, using the DRASTIC Model and Geographic Information System (GIS)

Authors: Aniedi A. Udo, Magnus U. Igboekwe, Rasaaq Bello, Francis D. Eyenaka, Michael C. Ohakwere-Eze

Abstract:

Groundwater vulnerability to pollution was assessed in Akwa Ibom State, Southern Nigeria, with the aim of locating areas with high potentials for resource contamination, especially due to anthropogenic influence. The electrical resistivity method was utilized in the collection of the initial field data. Additional data input, which included depth to static water level, drilled well log data, aquifer recharge data, percentage slope, as well as soil information, were sourced from secondary sources. The initial field data were interpreted both manually and with computer modeling to provide information on the geoelectric properties of the subsurface. Interpreted results together with the secondary data were used to develop the DRASTIC thematic maps. A vulnerability assessment was performed using the DRASTIC model in a GIS environment and areas with high vulnerability which needed immediate attention was clearly mapped out and presented using an aquifer vulnerability map. The model was subjected to validation and the rate of validity was 73% within the area of study.

Keywords: groundwater, vulnerability, DRASTIC model, pollution

Procedia PDF Downloads 206
24541 Assessment of Potentially Harmful Elements in Floodplain Soils and Stream Sediments in Ile-Ife Area, South-Western Nigeria: Using Geographic Information System and Multi-Variances Approaches

Authors: I. T. Asowata, A. S. Akinwumiju

Abstract:

The enrichment of potentially harmful elements (PHEs) in stream sediments (SS) and floodplain soils (FS) poses great environmental hazards to water bodies and other parts of the ecosystem. The aim of this research was to assess the distribution pattern of selected PHEs (Cu, Pb, Zn, Co, Mn, As, Cd, V, Cr, Ni, Th, Sr, and La) in SS of selected rivers that drain Ile-Ife area and their adjacent FS, to ascertain the pollution status of these elements in the study area. 60 samples (40 SS and 20 FS) were purposely collected for this study; the samples were air-dried at room temperature, disaggregated, sieved with > 63 µm and digested with modified aqua reqia (1:1:1 HCl:HNO₃:H₂O) and were analysed with ultra-trace inductively coupled plasma mass spectrometry method (ICP-ES). The geochemical results showed decreasing trend of average contents of PHEs studied Mn > Zn > V > Cr > Pb > La > Sr > Cu > Ni > Co > Th > As > Cd for both SS and FS. Floodplain topsoil in ppm, Cu range from 10.0-180.0; mean, 71.1, Pb, 17.1-255.0; 93.5 and Zn, 83.0-3122.2; 826.0. Also, floodplain sub-soils, Cu range from 30.0-203.1; mean of 76.6, Pb, 16.0-214.0; 77.9 and Zn, 59.1-2351.0; 622.3. Similarly, SS results for Cu, 22.1-257.0; 70.3, Pb, 15.0-172.0; 67.3 and Zn, 65.0-1285.0; 357.8, among other PHEs, suggesting significant level of PHEs enrichment in the studied geo media. Elemental association showed positive and/or negative correlation among the PHEs and also showed different sources of metal enrichment to be largely anthropogenic with some geogenic. Geoaccumulation and metal ratio indexes indicated that FS and SS studied have received significant PHEs of between moderately to strongly polluted, which implies significant environmental implications in the study area.

Keywords: aqua regia, enrichment, GIS, Ile-Ife, potentially harmful elements

Procedia PDF Downloads 158
24540 Broad Survey of Fine Root Traits to Investigate the Root Economic Spectrum Hypothesis and Plant-Fire Dynamics Worldwide

Authors: Jacob Lewis Watts, Adam F. A. Pellegrini

Abstract:

Prairies, grasslands, and forests cover an expansive portion of the world’s surface and contribute significantly to Earth’s carbon cycle. The largest driver of carbon dynamics in some of these ecosystems is fire. As the global climate changes, most fire-dominated ecosystems will experience increased fire frequency and intensity, leading to increased carbon flux into the atmosphere and soil nutrient depletion. The plant communities associated with different fire regimes are important for reassimilation of carbon lost during fire and soil recovery. More frequent fires promote conservative plant functional traits aboveground; however, belowground fine root traits are poorly explored and arguably more important drivers of ecosystem function as the primary interface between the soil and plant. The root economic spectrum (RES) hypothesis describes single-dimensional covariation between important fine-root traits along a range of plant strategies from acquisitive to conservative – parallel to the well-established leaf economic spectrum (LES). However, because of the paucity of root trait data, the complex nature of the rhizosphere, and the phylogenetic conservatism of root traits, it is unknown whether the RES hypothesis accurately describes plant nutrient and water acquisition strategies. This project utilizesplants grown in common garden conditions in the Cambridge University Botanic Garden and a meta-analysis of long-term fire manipulation experiments to examine the belowground physiological traits of fire-adapted and non-fire-adapted herbaceous species to 1) test the RES hypothesis and 2) describe the effect of fire regimes on fine root functional traits – which in turn affect carbon and nutrient cycling. A suite of morphological, chemical, and biological root traits (e.g. root diameter, specific root length, percent N, percent mycorrhizal colonization, etc.) of 50 herbaceous species were measuredand tested for phylogenetic conservatism and RES dimensionality. Fire-adapted and non-fire-adapted plants traits were compared using phylogenetic PCA techniques. Preliminary evidence suggests that phylogenetic conservatism may weaken the single-dimensionality of the RES, suggesting that there may not be a single way that plants optimize nutrient and water acquisition and storage in the complex rhizosphere; additionally, fire-adapted species are expected to be more conservative than non-fire-adapted species, which may be indicative of slower carbon cycling with increasing fire frequency and intensity.

Keywords: climate change, fire regimes, root economic spectrum, fine roots

Procedia PDF Downloads 122
24539 A Review Paper on Data Security in Precision Agriculture Using Internet of Things

Authors: Tonderai Muchenje, Xolani Mkhwanazi

Abstract:

Precision agriculture uses a number of technologies, devices, protocols, and computing paradigms to optimize agricultural processes. Big data, artificial intelligence, cloud computing, and edge computing are all used to handle the huge amounts of data generated by precision agriculture. However, precision agriculture is still emerging and has a low level of security features. Furthermore, future solutions will demand data availability and accuracy as key points to help farmers, and security is important to build robust and efficient systems. Since precision agriculture comprises a wide variety and quantity of resources, security addresses issues such as compatibility, constrained resources, and massive data. Moreover, conventional protection schemes used in the traditional internet may not be useful for agricultural systems, creating extra demands and opportunities. Therefore, this paper aims at reviewing state of the art of precision agriculture security, particularly in open field agriculture, discussing its architecture, describing security issues, and presenting the major challenges and future directions.

Keywords: precision agriculture, security, IoT, EIDE

Procedia PDF Downloads 88
24538 Commercial Automobile Insurance: A Practical Approach of the Generalized Additive Model

Authors: Nicolas Plamondon, Stuart Atkinson, Shuzi Zhou

Abstract:

The insurance industry is usually not the first topic one has in mind when thinking about applications of data science. However, the use of data science in the finance and insurance industry is growing quickly for several reasons, including an abundance of reliable customer data, ferocious competition requiring more accurate pricing, etc. Among the top use cases of data science, we find pricing optimization, customer segmentation, customer risk assessment, fraud detection, marketing, and triage analytics. The objective of this paper is to present an application of the generalized additive model (GAM) on a commercial automobile insurance product: an individually rated commercial automobile. These are vehicles used for commercial purposes, but for which there is not enough volume to apply pricing to several vehicles at the same time. The GAM model was selected as an improvement over GLM for its ease of use and its wide range of applications. The model was trained using the largest split of the data to determine model parameters. The remaining part of the data was used as testing data to verify the quality of the modeling activity. We used the Gini coefficient to evaluate the performance of the model. For long-term monitoring, commonly used metrics such as RMSE and MAE will be used. Another topic of interest in the insurance industry is to process of producing the model. We will discuss at a high level the interactions between the different teams with an insurance company that needs to work together to produce a model and then monitor the performance of the model over time. Moreover, we will discuss the regulations in place in the insurance industry. Finally, we will discuss the maintenance of the model and the fact that new data does not come constantly and that some metrics can take a long time to become meaningful.

Keywords: insurance, data science, modeling, monitoring, regulation, processes

Procedia PDF Downloads 74
24537 Modeling Pan Evaporation Using Intelligent Methods of ANN, LSSVM and Tree Model M5 (Case Study: Shahroud and Mayamey Stations)

Authors: Hamidreza Ghazvinian, Khosro Ghazvinian, Touba Khodaiean

Abstract:

The importance of evaporation estimation in water resources and agricultural studies is undeniable. Pan evaporation are used as an indicator to determine the evaporation of lakes and reservoirs around the world due to the ease of interpreting its data. In this research, intelligent models were investigated in estimating pan evaporation on a daily basis. Shahroud and Mayamey were considered as the studied cities. These two cities are located in Semnan province in Iran. The mentioned cities have dry weather conditions that are susceptible to high evaporation potential. Meteorological data of 11 years of synoptic stations of Shahrood and Mayamey cities were used. The intelligent models used in this study are Artificial Neural Network (ANN), Least Squares Support Vector Machine (LSSVM), and M5 tree models. Meteorological parameters of minimum and maximum air temperature (Tmax, Tmin), wind speed (WS), sunshine hours (SH), air pressure (PA), relative humidity (RH) as selected input data and evaporation data from pan (EP) to The output data was considered. 70% of data is used at the education level, and 30 % of the data is used at the test level. Models used with explanation coefficient evaluation (R2) Root of Mean Squares Error (RMSE) and Mean Absolute Error (MAE). The results for the two Shahroud and Mayamey stations showed that the above three models' operations are rather appropriate.

Keywords: pan evaporation, intelligent methods, shahroud, mayamey

Procedia PDF Downloads 73
24536 Wildland Fire in Terai Arc Landscape of Lesser Himalayas Threatning the Tiger Habitat

Authors: Amit Kumar Verma

Abstract:

The present study deals with fire prediction model in Terai Arc Landscape, one of the most dramatic ecosystems in Asia where large, wide-ranging species such as tiger, rhinos, and elephant will thrive while bringing economic benefits to the local people. Forest fires cause huge economic and ecological losses and release considerable quantities of carbon into the air and is an important factor inflating the global burden of carbon emissions. Forest fire is an important factor of behavioral cum ecological habit of tiger in wild. Post fire changes i.e. micro and macro habitat directly affect the tiger habitat or land. Vulnerability of fire depicts the changes in microhabitat (humus, soil profile, litter, vegetation, grassland ecosystem). Microorganism like spider, annelids, arthropods and other favorable microorganism directly affect by the forest fire and indirectly these entire microorganisms are responsible for the development of tiger (Panthera tigris) habitat. On the other hand, fire brings depletion in prey species and negative movement of tiger from wild to human- dominated areas, which may leads the conflict i.e. dangerous for both tiger & human beings. Early forest fire prediction through mapping the risk zones can help minimize the fire frequency and manage forest fires thereby minimizing losses. Satellite data plays a vital role in identifying and mapping forest fire and recording the frequency with which different vegetation types are affected. Thematic hazard maps have been generated by using IDW technique. A prediction model for fire occurrence is developed for TAL. The fire occurrence records were collected from state forest department from 2000 to 2014. Disciminant function models was used for developing a prediction model for forest fires in TAL, random points for non-occurrence of fire have been generated. Based on the attributes of points of occurrence and non-occurrence, the model developed predicts the fire occurrence. The map of predicted probabilities classified the study area into five classes very high (12.94%), high (23.63%), moderate (25.87%), low(27.46%) and no fire (10.1%) based upon the intensity of hazard. model is able to classify 78.73 percent of points correctly and hence can be used for the purpose with confidence. Overall, also the model works correctly with almost 69% of points. This study exemplifies the usefulness of prediction model of forest fire and offers a more effective way for management of forest fire. Overall, this study depicts the model for conservation of tiger’s natural habitat and forest conservation which is beneficial for the wild and human beings for future prospective.

Keywords: fire prediction model, forest fire hazard, GIS, landsat, MODIS, TAL

Procedia PDF Downloads 350
24535 Generating Insights from Data Using a Hybrid Approach

Authors: Allmin Susaiyah, Aki Härmä, Milan Petković

Abstract:

Automatic generation of insights from data using insight mining systems (IMS) is useful in many applications, such as personal health tracking, patient monitoring, and business process management. Existing IMS face challenges in controlling insight extraction, scaling to large databases, and generalising to unseen domains. In this work, we propose a hybrid approach consisting of rule-based and neural components for generating insights from data while overcoming the aforementioned challenges. Firstly, a rule-based data 2CNL component is used to extract statistically significant insights from data and represent them in a controlled natural language (CNL). Secondly, a BERTSum-based CNL2NL component is used to convert these CNLs into natural language texts. We improve the model using task-specific and domain-specific fine-tuning. Our approach has been evaluated using statistical techniques and standard evaluation metrics. We overcame the aforementioned challenges and observed significant improvement with domain-specific fine-tuning.

Keywords: data mining, insight mining, natural language generation, pre-trained language models

Procedia PDF Downloads 117
24534 Review of K0-Factors and Related Nuclear Data of the Selected Radionuclides for Use in K0-NAA

Authors: Manh-Dung Ho, Van-Giap Pham, Van-Doanh Ho, Quang-Thien Tran, Tuan-Anh Tran

Abstract:

The k0-factors and related nuclear data, i.e. the Q0-factors and effective resonance energies (Ēr) of the selected radionuclides which are used in the k0-based neutron activation analysis (k0-NAA), were critically reviewed to be integrated in the “k0-DALAT” software. The k0- and Q0-factors of some short-lived radionuclides: 46mSc, 110Ag, 116m2In, 165mDy, and 183mW, were experimentally determined at the Dalat research reactor. The other radionuclides selected are: 20F, 36S, 49Ca, 60mCo, 60Co, 75Se, 77mSe, 86mRb, 115Cd, 115mIn, 131Ba, 134mCs, 134Cs, 153Gd, 153Sm, 159Gd, 170Tm, 177mYb, 192Ir, 197mHg, 239U and 239Np. The reviewed data as compared with the literature data were biased within 5.6-7.3% in which the experimental re-determined factors were within 6.1 and 7.3%. The NIST standard reference materials: Oyster Tissue (1566b), Montana II Soil (2711a) and Coal Fly Ash (1633b) were used to validate the new reviewed data showing that the new data gave an improved k0-NAA using the “k0-DALAT” software with a factor of 4.5-6.8% for the investigated radionuclides.

Keywords: neutron activation analysis, k0-based method, k0 factor, Q0 factor, effective resonance energy

Procedia PDF Downloads 123
24533 Optimizing Electric Vehicle Charging with Charging Data Analytics

Authors: Tayyibah Khanam, Mohammad Saad Alam, Sanchari Deb, Yasser Rafat

Abstract:

Electric vehicles are considered as viable replacements to gasoline cars since they help in reducing harmful emissions and stimulate power generation through renewable energy sources, hence contributing to sustainability. However, one of the significant obstacles in the mass deployment of electric vehicles is the charging time anxiety among users and, thus, the subsequent large waiting times for available chargers at charging stations. Data analytics, on the other hand, has revolutionized the decision-making tasks of management and operating systems since its arrival. In this paper, we attempt to optimize the choice of EV charging stations for users in their vicinity by minimizing the time taken to reach the charging stations and the waiting times for available chargers. Time taken to travel to the charging station is calculated by the Google Maps API and the waiting times are predicted by polynomial regression of the historical data stored. The proposed framework utilizes real-time data and historical data from all operating charging stations in the city and assists the user in finding the best suitable charging station for their current situation and can be implemented in a mobile phone application. The algorithm successfully predicts the most optimal choice of a charging station and the minimum required time for various sample data sets.

Keywords: charging data, electric vehicles, machine learning, waiting times

Procedia PDF Downloads 191
24532 Finding Data Envelopment Analysis Targets Using Multi-Objective Programming in DEA-R with Stochastic Data

Authors: R. Shamsi, F. Sharifi

Abstract:

In this paper, we obtain the projection of inefficient units in data envelopment analysis (DEA) in the case of stochastic inputs and outputs using the multi-objective programming (MOP) structure. In some problems, the inputs might be stochastic while the outputs are deterministic, and vice versa. In such cases, we propose a multi-objective DEA-R model because in some cases (e.g., when unnecessary and irrational weights by the BCC model reduce the efficiency score), an efficient decision-making unit (DMU) is introduced as inefficient by the BCC model, whereas the DMU is considered efficient by the DEA-R model. In some other cases, only the ratio of stochastic data may be available (e.g., the ratio of stochastic inputs to stochastic outputs). Thus, we provide a multi-objective DEA model without explicit outputs and prove that the input-oriented MOP DEA-R model in the invariable return to scale case can be replaced by the MOP-DEA model without explicit outputs in the variable return to scale and vice versa. Using the interactive methods for solving the proposed model yields a projection corresponding to the viewpoint of the DM and the analyst, which is nearer to reality and more practical. Finally, an application is provided.

Keywords: DEA-R, multi-objective programming, stochastic data, data envelopment analysis

Procedia PDF Downloads 104
24531 Integrated Model for Enhancing Data Security Processing Time in Cloud Computing

Authors: Amani A. Saad, Ahmed A. El-Farag, El-Sayed A. Helali

Abstract:

Cloud computing is an important and promising field in the recent decade. Cloud computing allows sharing resources, services and information among the people of the whole world. Although the advantages of using clouds are great, but there are many risks in a cloud. The data security is the most important and critical problem of cloud computing. In this research a new security model for cloud computing is proposed for ensuring secure communication system, hiding information from other users and saving the user's times. In this proposed model Blowfish encryption algorithm is used for exchanging information or data, and SHA-2 cryptographic hash algorithm is used for data integrity. For user authentication process a simple user-name and password is used, the password uses SHA-2 for one way encryption. The proposed system shows an improvement of the processing time of uploading and downloading files on the cloud in secure form.

Keywords: cloud computing, data security, SAAS, PAAS, IAAS, Blowfish

Procedia PDF Downloads 356
24530 Comparison of Statistical Methods for Estimating Missing Precipitation Data in the River Subbasin Lenguazaque, Colombia

Authors: Miguel Cañon, Darwin Mena, Ivan Cabeza

Abstract:

In this work was compared and evaluated the applicability of statistical methods for the estimation of missing precipitations data in the basin of the river Lenguazaque located in the departments of Cundinamarca and Boyacá, Colombia. The methods used were the method of simple linear regression, distance rate, local averages, mean rates, correlation with nearly stations and multiple regression method. The analysis used to determine the effectiveness of the methods is performed by using three statistical tools, the correlation coefficient (r2), standard error of estimation and the test of agreement of Bland and Altmant. The analysis was performed using real rainfall values removed randomly in each of the seasons and then estimated using the methodologies mentioned to complete the missing data values. So it was determined that the methods with the highest performance and accuracy in the estimation of data according to conditions that were counted are the method of multiple regressions with three nearby stations and a random application scheme supported in the precipitation behavior of related data sets.

Keywords: statistical comparison, precipitation data, river subbasin, Bland and Altmant

Procedia PDF Downloads 466
24529 Hyperspectral Data Classification Algorithm Based on the Deep Belief and Self-Organizing Neural Network

Authors: Li Qingjian, Li Ke, He Chun, Huang Yong

Abstract:

In this paper, the method of combining the Pohl Seidman's deep belief network with the self-organizing neural network is proposed to classify the target. This method is mainly aimed at the high nonlinearity of the hyperspectral image, the high sample dimension and the difficulty in designing the classifier. The main feature of original data is extracted by deep belief network. In the process of extracting features, adding known labels samples to fine tune the network, enriching the main characteristics. Then, the extracted feature vectors are classified into the self-organizing neural network. This method can effectively reduce the dimensions of data in the spectrum dimension in the preservation of large amounts of raw data information, to solve the traditional clustering and the long training time when labeled samples less deep learning algorithm for training problems, improve the classification accuracy and robustness. Through the data simulation, the results show that the proposed network structure can get a higher classification precision in the case of a small number of known label samples.

Keywords: DBN, SOM, pattern classification, hyperspectral, data compression

Procedia PDF Downloads 340
24528 Assessing Performance of Data Augmentation Techniques for a Convolutional Network Trained for Recognizing Humans in Drone Images

Authors: Masood Varshosaz, Kamyar Hasanpour

Abstract:

In recent years, we have seen growing interest in recognizing humans in drone images for post-disaster search and rescue operations. Deep learning algorithms have shown great promise in this area, but they often require large amounts of labeled data to train the models. To keep the data acquisition cost low, augmentation techniques can be used to create additional data from existing images. There are many techniques of such that can help generate variations of an original image to improve the performance of deep learning algorithms. While data augmentation is potentially assumed to improve the accuracy and robustness of the models, it is important to ensure that the performance gains are not outweighed by the additional computational cost or complexity of implementing the techniques. To this end, it is important to evaluate the impact of data augmentation on the performance of the deep learning models. In this paper, we evaluated the most currently available 2D data augmentation techniques on a standard convolutional network which was trained for recognizing humans in drone images. The techniques include rotation, scaling, random cropping, flipping, shifting, and their combination. The results showed that the augmented models perform 1-3% better compared to a base network. However, as the augmented images only contain the human parts already visible in the original images, a new data augmentation approach is needed to include the invisible parts of the human body. Thus, we suggest a new method that employs simulated 3D human models to generate new data for training the network.

Keywords: human recognition, deep learning, drones, disaster mitigation

Procedia PDF Downloads 91
24527 Emotional Artificial Intelligence and the Right to Privacy

Authors: Emine Akar

Abstract:

The majority of privacy-related regulation has traditionally focused on concepts that are perceived to be well-understood or easily describable, such as certain categories of data and personal information or images. In the past century, such regulation appeared reasonably suitable for its purposes. However, technologies such as AI, combined with ever-increasing capabilities to collect, process, and store “big data”, not only require calibration of these traditional understandings but may require re-thinking of entire categories of privacy law. In the presentation, it will be explained, against the background of various emerging technologies under the umbrella term “emotional artificial intelligence”, why modern privacy law will need to embrace human emotions as potentially private subject matter. This argument can be made on a jurisprudential level, given that human emotions can plausibly be accommodated within the various concepts that are traditionally regarded as the underlying foundation of privacy protection, such as, for example, dignity, autonomy, and liberal values. However, the practical reasons for regarding human emotions as potentially private subject matter are perhaps more important (and very likely more convincing from the perspective of regulators). In that respect, it should be regarded as alarming that, according to most projections, the usefulness of emotional data to governments and, particularly, private companies will not only lead to radically increased processing and analysing of such data but, concerningly, to an exponential growth in the collection of such data. In light of this, it is also necessity to discuss options for how regulators could address this emerging threat.

Keywords: AI, privacy law, data protection, big data

Procedia PDF Downloads 87
24526 Develop a Conceptual Data Model of Geotechnical Risk Assessment in Underground Coal Mining Using a Cloud-Based Machine Learning Platform

Authors: Reza Mohammadzadeh

Abstract:

The major challenges in geotechnical engineering in underground spaces arise from uncertainties and different probabilities. The collection, collation, and collaboration of existing data to incorporate them in analysis and design for given prospect evaluation would be a reliable, practical problem solving method under uncertainty. Machine learning (ML) is a subfield of artificial intelligence in statistical science which applies different techniques (e.g., Regression, neural networks, support vector machines, decision trees, random forests, genetic programming, etc.) on data to automatically learn and improve from them without being explicitly programmed and make decisions and predictions. In this paper, a conceptual database schema of geotechnical risks in underground coal mining based on a cloud system architecture has been designed. A new approach of risk assessment using a three-dimensional risk matrix supported by the level of knowledge (LoK) has been proposed in this model. Subsequently, the model workflow methodology stages have been described. In order to train data and LoK models deployment, an ML platform has been implemented. IBM Watson Studio, as a leading data science tool and data-driven cloud integration ML platform, is employed in this study. As a Use case, a data set of geotechnical hazards and risk assessment in underground coal mining were prepared to demonstrate the performance of the model, and accordingly, the results have been outlined.

Keywords: data model, geotechnical risks, machine learning, underground coal mining

Procedia PDF Downloads 274
24525 Classification of Poverty Level Data in Indonesia Using the Naïve Bayes Method

Authors: Anung Style Bukhori, Ani Dijah Rahajoe

Abstract:

Poverty poses a significant challenge in Indonesia, requiring an effective analytical approach to understand and address this issue. In this research, we applied the Naïve Bayes classification method to examine and classify poverty data in Indonesia. The main focus is on classifying data using RapidMiner, a powerful data analysis platform. The analysis process involves data splitting to train and test the classification model. First, we collected and prepared a poverty dataset that includes various factors such as education, employment, and health..The experimental results indicate that the Naïve Bayes classification model can provide accurate predictions regarding the risk of poverty. The use of RapidMiner in the analysis process offers flexibility and efficiency in evaluating the model's performance. The classification produces several values to serve as the standard for classifying poverty data in Indonesia using Naive Bayes. The accuracy result obtained is 40.26%, with a moderate recall result of 35.94%, a high recall result of 63.16%, and a low recall result of 38.03%. The precision for the moderate class is 58.97%, for the high class is 17.39%, and for the low class is 58.70%. These results can be seen from the graph below.

Keywords: poverty, classification, naïve bayes, Indonesia

Procedia PDF Downloads 53
24524 Web Search Engine Based Naming Procedure for Independent Topic

Authors: Takahiro Nishigaki, Takashi Onoda

Abstract:

In recent years, the number of document data has been increasing since the spread of the Internet. Many methods have been studied for extracting topics from large document data. We proposed Independent Topic Analysis (ITA) to extract topics independent of each other from large document data such as newspaper data. ITA is a method for extracting the independent topics from the document data by using the Independent Component Analysis. The topic represented by ITA is represented by a set of words. However, the set of words is quite different from the topics the user imagines. For example, the top five words with high independence of a topic are as follows. Topic1 = {"scor", "game", "lead", "quarter", "rebound"}. This Topic 1 is considered to represent the topic of "SPORTS". This topic name "SPORTS" has to be attached by the user. ITA cannot name topics. Therefore, in this research, we propose a method to obtain topics easy for people to understand by using the web search engine, topics given by the set of words given by independent topic analysis. In particular, we search a set of topical words, and the title of the homepage of the search result is taken as the topic name. And we also use the proposed method for some data and verify its effectiveness.

Keywords: independent topic analysis, topic extraction, topic naming, web search engine

Procedia PDF Downloads 118
24523 Extracting Terrain Points from Airborne Laser Scanning Data in Densely Forested Areas

Authors: Ziad Abdeldayem, Jakub Markiewicz, Kunal Kansara, Laura Edwards

Abstract:

Airborne Laser Scanning (ALS) is one of the main technologies for generating high-resolution digital terrain models (DTMs). DTMs are crucial to several applications, such as topographic mapping, flood zone delineation, geographic information systems (GIS), hydrological modelling, spatial analysis, etc. Laser scanning system generates irregularly spaced three-dimensional cloud of points. Raw ALS data are mainly ground points (that represent the bare earth) and non-ground points (that represent buildings, trees, cars, etc.). Removing all the non-ground points from the raw data is referred to as filtering. Filtering heavily forested areas is considered a difficult and challenging task as the canopy stops laser pulses from reaching the terrain surface. This research presents an approach for removing non-ground points from raw ALS data in densely forested areas. Smoothing splines are exploited to interpolate and fit the noisy ALS data. The presented filter utilizes a weight function to allocate weights for each point of the data. Furthermore, unlike most of the methods, the presented filtering algorithm is designed to be automatic. Three different forested areas in the United Kingdom are used to assess the performance of the algorithm. The results show that the generated DTMs from the filtered data are accurate (when compared against reference terrain data) and the performance of the method is stable for all the heavily forested data samples. The average root mean square error (RMSE) value is 0.35 m.

Keywords: airborne laser scanning, digital terrain models, filtering, forested areas

Procedia PDF Downloads 138