Search results for: data harvesting
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 24776

24266 Dataset Quality Index: Development of Composite Indicator Based on Standard Data Quality Indicators

Authors: Sakda Loetpiparwanich, Preecha Vichitthamaros

Abstract:

Poor data quality is considered one of the major costs of a data project. A data project with data quality awareness spends a large share of its time on data quality processes, while a data project without such awareness suffers negative impacts on financial resources, efficiency, productivity, and credibility. One of the most time-consuming processes is defining the expectations and measurements of data quality, because expectations differ with the purpose of each data project. This is especially true of big data projects, which may involve many datasets and stakeholders and therefore take a long time to discuss and define quality expectations and measurements. This study therefore aimed to develop meaningful indicators that describe the overall data quality of each dataset to support quick comparison and prioritization. The objectives of this study were to: (1) develop practical data quality indicators and measurements, (2) develop data quality dimensions based on statistical characteristics, and (3) develop a composite indicator that can describe the overall data quality of each dataset. The sample consisted of more than 500 datasets from public sources obtained by random sampling. After the datasets were collected, the Dataset Quality Index (SDQI) was developed in five steps. First, we defined standard data quality expectations. Second, we identified indicators that can be measured directly on the data within datasets. Third, the indicators were aggregated into dimensions using factor analysis. Next, the indicators and dimensions were weighted by the effort required for data preparation and by usability. Finally, the dimensions were aggregated into the composite indicator. The results showed that: (1) the developed indicators and measurements comprised ten indicators; (2) for the data quality dimensions based on statistical characteristics, the ten indicators could be reduced to four dimensions; and (3) the SDQI can describe the overall quality of each dataset and can separate datasets into three levels: Good Quality, Acceptable Quality, and Poor Quality. In conclusion, the SDQI provides an overall description of data quality within datasets together with a meaningful composition. The SDQI can be used to assess all data in a data project, to estimate effort, and to set priorities. The SDQI also works well with the Agile method by using it for assessment in the first sprint. After the initial evaluation is passed, more specific data quality indicators can be added in the next sprint.
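
As a rough illustration of the final aggregation step described in this abstract, the sketch below combines dimension scores into one composite index with a weighted average and maps it to three quality levels. The dimension names, weights, and cut-off thresholds are hypothetical placeholders chosen for the example; the abstract does not state the study's actual values.

```python
def composite_index(dimension_scores, weights):
    """Weighted average of dimension scores (each assumed to lie in [0, 1])."""
    total_weight = sum(weights[name] for name in dimension_scores)
    weighted_sum = sum(dimension_scores[name] * weights[name] for name in dimension_scores)
    return weighted_sum / total_weight


def quality_level(index, good=0.8, acceptable=0.5):
    """Map a composite index to one of three levels; thresholds are assumed."""
    if index >= good:
        return "Good Quality"
    if index >= acceptable:
        return "Acceptable Quality"
    return "Poor Quality"


# Hypothetical scores and weights for the four dimensions of one dataset.
scores = {"completeness": 0.9, "consistency": 0.7, "uniqueness": 1.0, "validity": 0.6}
weights = {"completeness": 0.4, "consistency": 0.2, "uniqueness": 0.1, "validity": 0.3}

index = composite_index(scores, weights)
print(round(index, 2), quality_level(index))
```

Dividing by the total weight keeps the index on the same scale as the dimension scores even when the weights do not sum to one.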

Keywords: data quality, dataset quality, data quality management, composite indicator, factor analysis, principal component analysis

Procedia PDF Downloads 119
24265 Predictive Analysis for Big Data: Extension of Classification and Regression Trees Algorithm

Authors: Ameur Abdelkader, Abed Bouarfa Hafida

Abstract:

Since its inception, predictive analysis has revolutionized the IT industry through its robustness and decision-making facilities. It involves the application of a set of data processing techniques and algorithms in order to create predictive models. Its principle is based on finding relationships between explanatory variables and the predicted variables. Past occurrences are exploited to predict and to derive the unknown outcome. With the advent of big data, many studies have suggested the use of predictive analytics to process and analyze big data. Nevertheless, they have been curbed by the limits of classical methods of predictive analysis when the amount of data is large. In fact, because of its volume, its nature (semi-structured or unstructured), and its variety, big data cannot be analyzed efficiently with classical methods of predictive analysis. The authors attribute this weakness to the fact that predictive analysis algorithms do not allow the parallelization and distribution of calculation. In this paper, we propose to extend the predictive analysis algorithm Classification And Regression Trees (CART) in order to adapt it for big data analysis. The major changes to this algorithm are presented, and then a version of the extended algorithm is defined in order to make it applicable to a huge quantity of data.
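
For orientation, the core step of classical CART on a single machine is the search for the binary split that minimizes weighted Gini impurity. The sketch below shows that step for one numeric feature. It is the textbook procedure, not the extended algorithm proposed in the paper, and the function names and sample data are illustrative assumptions.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(values, labels):
    """Return (threshold, weighted_gini) of the best split 'value <= threshold'."""
    best = (None, float("inf"))
    # The largest value is skipped because it would leave the right side empty.
    for threshold in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (threshold, score)
    return best


# Illustrative data: the two classes separate cleanly between 3 and 10.
values = [1, 2, 3, 10, 11, 12]
labels = ["a", "a", "a", "b", "b", "b"]
print(best_split(values, labels))
```

Each feature can be scored independently of the others, which is one reason split search is a natural candidate for the parallel and distributed computation the abstract calls for.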

Keywords: predictive analysis, big data, predictive analysis algorithms, CART algorithm

Procedia PDF Downloads 127
24264 Canopy Temperature Acquired from Daytime and Nighttime Aerial Data as an Indicator of Trees’ Health Status

Authors: Agata Zakrzewska, Dominik Kopeć, Adrian Ochtyra

Abstract:

The growing number of new cameras, sensors, and research methods allows for a broader application of thermal data in remote sensing vegetation studies. The aim of this research was to check whether thermal infrared data in the 3.6-4.9 μm spectral range, obtained during the day and at night, can be used to assess the health condition of selected species of deciduous trees in an urban environment. For this purpose, research was carried out in the city center of Warsaw (Poland) in 2020. During the airborne data acquisition, thermal data, laser scanning, and orthophoto map images were collected. Synchronously with the airborne data, ground reference data were obtained for 617 studied trees of five species (Acer platanoides, Acer pseudoplatanus, Aesculus hippocastanum, Tilia cordata, and Tilia × euchlora) in different health condition states. The results were as follows: (i) healthy trees are cooler than trees in poor condition and dying trees in both the daytime and nighttime data; (ii) the difference in mean canopy temperature between healthy and dying trees was 1.06 °C in the nighttime data and 3.28 °C in the daytime data; (iii) condition classes differ significantly in both daytime and nighttime thermal data, but only in the daytime data did all condition classes differ statistically significantly from each other. In conclusion, aerial thermal data can be considered an alternative to hyperspectral data as a method of assessing the health condition of trees in an urban environment, especially data obtained during the day, which can differentiate condition classes better than data obtained at night. The method based on thermal infrared and laser scanning data fusion could be a quick and efficient solution for identifying trees in poor health that should be visually checked in the field.

Keywords: middle wave infrared, thermal imagery, tree discoloration, urban trees

Procedia PDF Downloads 99
24263 Onion Storage and the Roof Influence in the Tropics

Authors: O. B. Imoukhuede, M. O. Ale

Abstract:

The periodic scarcity of onion requires an urgent solution in the Nigerian agro-economy. The high percentage of onion losses incurred after the harvesting period is due to the non-availability of appropriate storage facilities. Therefore, storage structures were constructed with different roofing materials. The response of the materials to weather parameters such as temperature and relative humidity was evaluated to determine their effects on the performance of the storage structures. Temperature and relative humidity were recorded three times daily, along with the weight of the onions in each structure; losses were identified using loss indices such as shrinkage, rottenness, sprouting, and colour, and the percentage loss per week was determined. The highest mean percentage loss (22%) was observed in the structure with iron roofing materials, while the structure with thatched materials had the lowest (9.4%). The highest temperature was observed in the structure with asbestos roofing materials, with no significant difference in temperature between the structures with thatched and iron materials. The highest relative humidity was found in the structure with asbestos roofing material and the lowest in the structure with iron materials. It was concluded that the storage structure with the thatched roof performed best in terms of losses.

Keywords: Nigeria, onion, storage structures, weather parameters, roof materials, losses

Procedia PDF Downloads 538
24262 Hierarchical Clustering Algorithms in Data Mining

Authors: Z. Abdullah, A. R. Hamdan

Abstract:

Clustering is the process of grouping data objects into clusters so that objects in the same cluster are similar to each other. Clustering algorithms are one of the areas of data mining, and they can be classified into partition-based, hierarchical, density-based, and grid-based algorithms. In this paper, we survey and review four major hierarchical clustering algorithms: CURE, ROCK, CHAMELEON, and BIRCH. The resulting state of the art of these algorithms will help in eliminating current problems, as well as in deriving more robust and scalable algorithms for clustering.
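
To make the hierarchical idea concrete, here is a minimal sketch of basic agglomerative clustering with single linkage on one-dimensional points. It is the plain textbook procedure, not an implementation of CURE, ROCK, CHAMELEON, or BIRCH, and the function name and sample points are illustrative assumptions.

```python
def agglomerative(points, num_clusters):
    """Single-linkage agglomerative clustering of 1-D points."""
    clusters = [[p] for p in points]  # start with one cluster per point
    while len(clusters) > num_clusters:
        best_pair, best_dist = None, float("inf")
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                dist = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if dist < best_dist:
                    best_pair, best_dist = (i, j), dist
        i, j = best_pair
        clusters[i].extend(clusters.pop(j))  # merge the two closest clusters
    return [sorted(c) for c in clusters]


print(agglomerative([1.0, 1.2, 5.0, 5.1, 9.0], 3))
```

The exhaustive pairwise search makes this sketch expensive on large inputs, which is the kind of scalability limit that motivates more specialized hierarchical algorithms.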

Keywords: clustering, unsupervised learning, algorithms, hierarchical

Procedia PDF Downloads 862
24261 End to End Monitoring in Oracle Fusion Middleware for Data Verification

Authors: Syed Kashif Ali, Usman Javaid, Abdullah Chohan

Abstract:

In large enterprises, multiple departments use different sorts of information systems and databases according to their needs. These systems are independent and heterogeneous in nature, and sharing information/data between them is not an easy task. The use of middleware technologies has made data sharing between systems very easy. However, monitoring the exchange of data/information between target and source systems for verification purposes is often complex or impossible for the maintenance department because of security/access privileges on the target and source systems. In this paper, we present our experience with an end-to-end data monitoring approach at the middleware level, implemented in Oracle BPEL, for data verification without the help of any monitoring tool.

Keywords: service level agreement, SOA, BPEL, oracle fusion middleware, web service monitoring

Procedia PDF Downloads 462
24260 Dissimilarity Measure for General Histogram Data and Its Application to Hierarchical Clustering

Authors: K. Umbleja, M. Ichino

Abstract:

Symbolic data mining has been developed to analyze data in very large datasets. It is also useful in cases where entry-specific details should remain hidden. Symbolic data mining is quickly gaining popularity as the datasets in need of analysis become ever larger. One type of symbolic data is the histogram, which makes it possible to store huge amounts of information in a single variable with a high level of granularity. Other types of symbolic data can also be described as histograms, which makes the histogram a very important and general symbolic data type: a method developed for histograms can also be applied to other types of symbolic data. Due to their complex structure, histograms are complicated to analyze. This paper proposes a method that compares two histogram-valued variables and thereby finds a dissimilarity between two histograms. The proposed method uses the Ichino-Yaguchi dissimilarity measure for mixed feature-type data analysis as a base and develops a dissimilarity measure specifically for histogram data, which allows histograms with different numbers of bins and bin widths (so-called general histograms) to be compared. The proposed dissimilarity measure is then used as a measure for clustering. Furthermore, a linkage method based on weighted averages is proposed, together with the concept of cluster compactness to measure the quality of clustering. The method is then validated by application to real datasets. As a result, the proposed dissimilarity measure is found to produce adequate and comparable results with general histograms without loss of detail or the need to transform the data.

Keywords: dissimilarity measure, hierarchical clustering, histograms, symbolic data analysis

Procedia PDF Downloads 148
24259 Climate Adaptive Building Shells for Plus-Energy-Buildings, Designed on Bionic Principles

Authors: Andreas Hammer

Abstract:

Six peculiar architecture designs from the Frankfurt University are discussed in this paper, and the future potential of their adaptable facades with integrated solar thin-film sheets is shown as they act and react to climate/solar changes at their specific sites. The different aspects, as well as limitations with regard to technical and functional restrictions, are named. The design process for a “multi-purpose building”, a “high-rise building refurbishment” and a “biker’s lodge” in the river Rheine valley has been critically outlined and developed step by step from an international studentship towards an overall energy strategy that firstly had to push the design to a plus-energy building and secondly had to incorporate bionic aspects into the design of the building skins. Both main parameters needed to be reviewed and refined during the whole design process. Various basic bionic approaches were provided for experimentation (e.g. solar ivyᵀᴹ, flectofinᵀᴹ or hygroskinᵀᴹ), regarding the use of bendable photovoltaic thin-film elements as parts of a hybrid, kinetic façade system.

Keywords: bionic and bioclimatic design, climate adaptive building shells [CABS], energy-strategy, harvesting façade, high-efficiency building skin, photovoltaic in building skins, plus-energy-buildings, solar gain, sustainable building concept

Procedia PDF Downloads 409
24258 WiFi Data Offloading: Bundling Method in a Canvas Business Model

Authors: Majid Mokhtarnia, Alireza Amini

Abstract:

Mobile operators face the increase in data traffic as a critical issue. As a result, a vital responsibility of operators is to deal with this trend in order to create added value. This paper addresses a bundling method in a Canvas business model for a WiFi Data Offloading (WDO) strategy, by which some elements of the model may be affected. In the proposed method, a number of data packages are sold to subscribers, some of which include a given volume of WiFi-offloaded data free of charge. The paper analyzes this method in terms of attractiveness and profitability. The results demonstrate that the quality of implementation of the WDO strongly affects the final result, and they help the decision maker to make the best decision.

Keywords: bundling, canvas business model, telecommunication, WiFi data offloading

Procedia PDF Downloads 178
24257 Distributed Perceptually Important Point Identification for Time Series Data Mining

Authors: Tak-Chung Fu, Ying-Kit Hung, Fu-Lai Chung

Abstract:

In the field of time series data mining, the concept of the Perceptually Important Point (PIP) identification process was first introduced in 2001. The process was originally used for financial time series pattern matching and was later found suitable for time series dimensionality reduction and representation. Its strength lies in preserving the overall shape of the time series by identifying its salient points. With the rise of Big Data, time series data account for a major proportion, especially data generated by sensors in the Internet of Things (IoT) environment. Given the nature of PIP identification and its successful applications, it is worth further exploring the opportunity to apply PIP to time series ‘Big Data’. However, the performance of PIP identification is always considered a limitation when dealing with ‘Big’ time series data. In this paper, two distributed versions of PIP identification based on the Specialized Binary (SB) Tree are proposed. The proposed approaches remove the bottleneck of running the PIP identification process on a standalone computer. The distributed versions achieve an improvement in speed.
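
A minimal standalone sketch of PIP identification may help show why it becomes a bottleneck on long series. It starts from the two endpoints and repeatedly adds the point farthest from the line joining its neighbouring PIPs. Vertical distance is assumed as the distance measure, and neither the SB-Tree nor the distributed versions from the paper are reproduced here. The function names and sample series are illustrative.

```python
def vertical_distance(series, left, right, i):
    """Vertical distance from point i to the line joining points left and right."""
    slope = (series[right] - series[left]) / (right - left)
    return abs(series[left] + slope * (i - left) - series[i])


def find_pips(series, num_pips):
    """Return the sorted indices of the first num_pips PIPs (series needs >= 2 points)."""
    pips = [0, len(series) - 1]  # the first two PIPs are the endpoints
    while len(pips) < min(num_pips, len(series)):
        best_index, best_dist = None, -1.0
        # Scan every point lying between each pair of adjacent PIPs.
        for left, right in zip(pips, pips[1:]):
            for i in range(left + 1, right):
                dist = vertical_distance(series, left, right, i)
                if dist > best_dist:
                    best_index, best_dist = i, dist
        pips.append(best_index)
        pips.sort()
    return pips


series = [1, 2, 8, 3, 2, 1, 5]
print(find_pips(series, 4))
```

Every new PIP triggers a rescan of the remaining points in this naive form, so the cost grows quickly with series length.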

Keywords: distributed computing, performance analysis, Perceptually Important Point identification, time series data mining

Procedia PDF Downloads 414
24256 Analysing Techniques for Fusing Multimodal Data in Predictive Scenarios Using Convolutional Neural Networks

Authors: Philipp Ruf, Massiwa Chabbi, Christoph Reich, Djaffar Ould-Abdeslam

Abstract:

In recent years, convolutional neural networks (CNN) have demonstrated high performance in image analysis, but oftentimes, there is only structured data available regarding a specific problem. By interpreting structured data as images, CNNs can effectively learn and extract valuable insights from tabular data, leading to improved predictive accuracy and uncovering hidden patterns that may not be apparent in traditional structured data analysis. In applying a single neural network for analyzing multimodal data, e.g., both structured and unstructured information, significant advantages in terms of time complexity and energy efficiency can be achieved. Converting structured data into images and merging them with existing visual material offers a promising solution for applying CNNs to multimodal datasets, as they often occur in a medical context. By employing suitable preprocessing techniques, structured data is transformed into image representations, where the respective features are expressed as different formations of colors and shapes. In an additional step, these representations are fused with existing images to incorporate both types of information. The resulting image is then analyzed using a CNN.

Keywords: CNN, image processing, tabular data, mixed dataset, data transformation, multimodal fusion

Procedia PDF Downloads 99
24255 Knowledge Discovery and Data Mining Techniques in Textile Industry

Authors: Filiz Ersoz, Taner Ersoz, Erkin Guler

Abstract:

This paper addresses issues and techniques for the textile industry using data mining. Data mining was applied to data on the stitching of garment products obtained from a textile company. The CHAID algorithm, the CART algorithm, regression analysis, and artificial neural networks were applied to the data. Classification-based analyses were used for data mining, and a decision model for production per person and the variables affecting production was obtained by this method. The results show that as daily working time increases, production per person decreases. In addition, the relationship between total daily working time and production per person is negative, and it is the strongest relationship found for production per person.

Keywords: data mining, textile production, decision trees, classification

Procedia PDF Downloads 331
24254 ARGO: An Open Designed Unmanned Surface Vehicle Mapping Autonomous Platform

Authors: Papakonstantinou Apostolos, Argyrios Moustakas, Panagiotis Zervos, Dimitrios Stefanakis, Manolis Tsapakis, Nektarios Spyridakis, Mary Paspaliari, Christos Kontos, Antonis Legakis, Sarantis Houzouris, Konstantinos Topouzelis

Abstract:

For years, unmanned and remotely operated robots have been used as tools in industry, research, and education. The rapid development and miniaturization of sensors that can be attached to remotely operated vehicles in recent years has allowed industry leaders and researchers to use them as an affordable means of data acquisition in air, land, and sea. Despite the recent developments in ground and unmanned airborne vehicles, only a small number of Unmanned Surface Vehicle (USV) platforms are targeted at mapping and monitoring environmental parameters for research and industry purposes. The ARGO project has developed an open-design USV equipped with a multi-level control hardware architecture and state-of-the-art sensors and payloads for the autonomous monitoring of environmental parameters in large sea areas. The proposed USV is a catamaran-type vessel controlled over a wireless radio link (5G) for long-range mapping capabilities and control from a ground-based control station. The ARGO USV has propulsion control using 2x fully redundant electric trolling motors with active vector thrust for omnidirectional movement, navigation with an open-source autopilot system and a high-accuracy GNSS device, and communication over a 2.4 GHz digital link able to provide 20 km of Line of Sight (LoS) range. The 3-meter dual-hull design and composite structure offer well above 80 kg of usable payload capacity. Furthermore, sun and friction energy harvesting methods provide clean energy to the propulsion system. The design is highly modular: each component or payload can be replaced or modified according to the desired task (industrial or research). The system can be equipped with a multiparameter sonde measuring up to 20 water parameters simultaneously, such as conductivity, salinity, turbidity, and dissolved oxygen. Furthermore, a high-end multibeam echo sounder can be installed in a specific boat datum for shallow-water high-resolution seabed mapping.
The system is designed to operate in the Aegean Sea. The developed USV is planned to be used as a system for autonomous data acquisition, mapping, and monitoring of bathymetry and various environmental parameters. The ARGO USV can operate in small or large ports with high maneuverability and endurance to map large geographical extents at sea. The system presents state-of-the-art solutions in the following areas: (i) on-board/real-time data processing/analysis capabilities, (ii) an energy-independent and environmentally friendly platform made entirely from the latest aeronautical and marine materials, (iii) the integration of advanced technology sensors, all in one system (photogrammetric and radiometric footprint, as well as its connection with various environmental and inertial sensors), and (iv) the information management application. The ARGO web-based application enables the system to depict the results of the data acquisition process in near real-time. All the recorded environmental variables and indices are presented, allowing users to remotely access all the raw and processed information using the implemented web-based GIS application.

Keywords: monitor marine environment, unmanned surface vehicle, mapping bathymetry, sea environmental monitoring

Procedia PDF Downloads 111
24253 Simulation-Based Optimization of a Non-Uniform Piezoelectric Energy Harvester with Stack Boundary

Authors: Alireza Keshmiri, Shahriar Bagheri, Nan Wu

Abstract:

This research presents an analytical model for the development of an energy harvester with piezoelectric rings stacked at the boundary of the structure, based on the Adomian decomposition method. The model is applied to geometrically non-uniform beams to derive the steady-state dynamic response of the structure subjected to base motion excitation and to efficiently harvest the resulting vibrational energy. The in-plane polarization of the piezoelectric rings is employed to enhance the electrical power output. A parametric study of the proposed energy harvester with various design parameters is carried out to prepare the dataset required for optimization. Finally, a simulation-based optimization technique helps to find the optimum structural design with maximum efficiency. To solve the optimization problem, an artificial neural network is first trained to replace the simulation model, and then a genetic algorithm is employed to find the optimized design variables. Higher geometrical non-uniformity and greater beam length lower the natural frequency of the structure and generate a larger power output.

Keywords: piezoelectricity, energy harvesting, simulation-based optimization, artificial neural network, genetic algorithm

Procedia PDF Downloads 107
24252 A Comparative Study on Indian and Greek Cotton Fiber Properties Correlations

Authors: Md. Nakib Ul Hasan, Md. Ariful Islam, Md. Sumon Miah, Misbah Ul Hoque, Bulbul Ahmed

Abstract:

The variability of cotton fiber characteristics has always been influenced by origin, weather conditions, method of culturing, and harvesting. Spinners work tirelessly to ensure consistent yarn quality by using fibers of different origins to maximize the profit margin. Spinners often fail to select the desired raw materials of various origins to achieve an appropriate mixing plan because of a lack of knowledge of the interrelationships among fiber properties. The purpose of this research is to investigate the correlations among dominant fiber properties such as micronaire, strength, breaking elongation, upper half mean length, length uniformity index, short fiber index, maturity, reflectance, and yellowness. For this purpose, fiber samples from 500 Indian cotton bales and 350 Greek cotton bales were collected and tested using the high volume instrument (HVI). The fiber properties dataset was then compiled and analyzed using Python 3.7 to determine the correlation matrix. Results show that for Indian cotton fiber the highest correlation is Str-Mat = 0.84, followed by SFI-Unf = -0.83 and Neps-Unf = -0.72. For Greek cotton fiber, in contrast, the highest correlation is SFI-Unf = -0.98, followed by SFI-Mat = 0.89, +b-Len = 0.84, and Str-Mat = 0.74. Overall, the Greek cotton fiber showed stronger correlations than the Indian cotton fiber.
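
The abstract does not say which correlation coefficient or Python library was used. As one plausible reading, the sketch below builds a Pearson correlation matrix using only the standard library. The column names echo the abbreviations in the abstract, but the numbers are made-up placeholders, not HVI measurements from the study.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Nested dict holding the correlation of every pair of named columns."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Made-up values for illustration only.
fibers = {
    "Str": [28.0, 29.5, 31.0, 30.2, 27.5],
    "Mat": [0.84, 0.86, 0.88, 0.87, 0.83],
    "SFI": [9.5, 8.7, 7.9, 8.2, 10.1],
}
matrix = correlation_matrix(fibers)
for name, row in matrix.items():
    print(name, {other: round(value, 2) for other, value in row.items()})
```

With a library such as pandas available, `DataFrame.corr()` would produce the same kind of matrix in one call.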

Keywords: cotton fiber, fiber properties correlation, Greek cotton, HVI, Indian cotton, spinning

Procedia PDF Downloads 143
24251 Investigation of Delivery of Triple Play Data in GE-PON Fiber to the Home Network

Authors: Ashima Anurag Sharma

Abstract:

Optical fiber based networks can deliver performance that supports the increasing demand for high-speed connections. One of the new technologies that has emerged in recent years is the Passive Optical Network. This research paper aims to show the simultaneous delivery of triple play services (data, voice, and video). A comparison between various data rates is presented. It is demonstrated that as the data rate increases, the number of users that can be supported decreases due to the increase in bit error rate.

Keywords: BER, PON, TDMPON, GPON, CWDM, OLT, ONT

Procedia PDF Downloads 510
24250 Microarray Gene Expression Data Dimensionality Reduction Using PCA

Authors: Fuad M. Alkoot

Abstract:

Different experimental technologies such as microarray sequencing have been proposed to generate high-resolution genetic data in order to understand the complex dynamic interactions between complex diseases and the biological system components of genes and gene products. However, the generated samples have a very large dimension, reaching thousands, which hinders all attempts to design a classifier system that can identify diseases based on such data. Additionally, the high overlap in the class distributions makes the task more difficult. The data we experiment with were generated for the identification of autism. They include 142 samples, which is small compared to the large dimension of the data. Classifier systems trained on this data yield very low classification rates that are almost equivalent to a guess. We aim to reduce the data dimension and improve the data for classification. Here, we experiment with applying a multistage PCA to the genetic data to reduce its dimensionality. Results show a significant improvement in the classification rates, which increases the possibility of building an automated system for autism detection.
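
The abstract does not describe how its multistage PCA is organized. As background only, the sketch below shows a single PCA stage in its simplest form: it finds the leading principal axis by power iteration on the sample covariance matrix and projects each sample onto it. The function names and toy data are illustrative assumptions, and for data with thousands of dimensions an SVD-based routine from a numerical library would be the practical choice.

```python
import math


def first_principal_component(rows, iterations=200):
    """Approximate the leading principal axis of the data by power iteration."""
    n, d = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    centred = [[row[j] - means[j] for j in range(d)] for row in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(r[a] * r[b] for r in centred) / (n - 1) for b in range(d)] for a in range(d)]
    vector = [1.0] * d  # assumed not orthogonal to the leading axis
    for _ in range(iterations):
        product = [sum(cov[a][b] * vector[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(v * v for v in product))
        vector = [v / norm for v in product]
    return means, vector


def project(rows, means, axis):
    """Reduce each row to its single coordinate along the given axis."""
    return [sum((row[j] - means[j]) * axis[j] for j in range(len(axis))) for row in rows]


# Toy data lying exactly on the line y = 2x.
data = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
means, axis = first_principal_component(data)
reduced = project(data, means, axis)
print(axis, reduced)
```

Projecting onto a few leading axes is what shrinks the feature count before a classifier is trained.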

Keywords: PCA, gene expression, dimensionality reduction, classification, autism

Procedia PDF Downloads 540
24249 Preharvest and Postharvest Factors Influencing Resveratrol, Myricetin and Quercetin Content of Wine

Authors: Mariam Khomasuridze, Nino Chkhartishvili, Irma Chanturia

Abstract:

The influence of preharvest and postharvest factors on the resveratrol, myricetin, and quercetin content of wine was studied. The content of cis- and trans-resveratrol, myricetin, and quercetin was analyzed by HPLC. Within the experiment, various factors affecting wine composition were investigated: variety, climate, viticulture practices, grape maturity, harvesting methods, and winemaking techniques. The results showed that varietal potential and yield play the most important role in the formation of antioxidant compounds. Based on the results, the use of medium-roast oak chips protects resveratrol, myricetin, and quercetin from coagulation and precipitation. Compared to the control samples, the wines produced with the addition of oak chips were approximately four times richer in these antioxidant compounds. The retention of resveratrol was 45% lower in wines produced in Qvevri by traditional Georgian technology without temperature control during fermentation. The opposite effect was found for myricetin, quercetin, and total phenolics content: their concentrations were 56-78% higher than in wines fermented in tanks at 22-25 °C. As a result of the experiment, an optimal winemaking technology scheme was worked out for wine rich in the biologically active compounds resveratrol, myricetin, and quercetin.

Keywords: resveratrol, myricetin, quercetin, wine

Procedia PDF Downloads 167
24248 Numerical and Experimental Investigation of a Mechanical System with a Pendulum

Authors: Andrzej Mitura, Krzysztof Kecik, Michal Augustyniak

Abstract:

This paper presents numerical and experimental research on a nonlinear two-degree-of-freedom system. The tested system consists of a mechanical oscillator (the primary subsystem) with an attached pendulum (the secondary subsystem). The oscillator is suspended on a linear (or nonlinear) coil spring and a nonlinear magnetorheological damper, and it is excited kinematically. The added pendulum can be used to reduce vibration of the primary subsystem or for energy harvesting. The numerical and experimental investigations showed that the pendulum can perform several types of motion, for example: chaotic motion, constant position in the lower or upper (stable inverted pendulum) equilibrium, rotation, and symmetrical or asymmetrical swinging vibrations. The main objective of this study is to determine the influence of system parameters on enlarging the zone in which the pendulum rotates. As a final result, a semi-active control method for switching the pendulum solution to rotation is proposed. The magnetorheological damper is used to implement this method. Continuous rotation of the pendulum is desirable for energy recovery. The work is financed by Grant no. 0234/IP2/2011/71 from the Polish Ministry of Science and Higher Education in years 2012-2014.

Keywords: autoparametric vibrations, chaos and rotation control, magnetorheological damper

Procedia PDF Downloads 362
24247 Data Science-Based Key Factor Analysis and Risk Prediction of Diabetic

Authors: Fei Gao, Rodolfo C. Raga Jr.

Abstract:

This research proposal aims to ascertain the major risk factors for diabetes and to design a predictive model for risk assessment. The project aims to improve diabetes early detection and management by utilizing data science techniques, which may improve patient outcomes and healthcare efficiency. The phase relation values of each attribute were used to analyze and choose the attributes that might influence the examiner's survival probability, using the Diabetes Health Indicators Dataset from Kaggle as the research data. We compare and evaluate eight machine learning algorithms. Our investigation begins with comprehensive data preprocessing, including feature engineering and dimensionality reduction, aimed at enhancing data quality. The dataset, comprising health indicators and medical data, serves as a foundation for training and testing these algorithms. A rigorous cross-validation process is applied, and we assess their performance using five key metrics: accuracy, precision, recall, F1-score, and area under the receiver operating characteristic curve (AUC-ROC). After analyzing the data characteristics, we investigate their impact on the likelihood of diabetes and develop corresponding risk indicators.
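
Four of the five metrics named in this abstract follow directly from the confusion counts of a binary classifier, as the sketch below shows. AUC-ROC is left out because it needs predicted scores rather than hard labels. The function name and the example labels are illustrative and not taken from the study.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1-score for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Illustrative labels: 1 = diabetic, 0 = not diabetic.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Under cross-validation these values would be computed on each held-out fold and then averaged.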

Keywords: diabetes, risk factors, predictive model, risk assessment, data science techniques, early detection, data analysis, Kaggle

Procedia PDF Downloads 54
24246 A Method for Harvesting Atmospheric Lightning-Energy and Utilization of Extra Generated Power of Nuclear Power Plants during the Low Energy Demand Periods

Authors: Akbar Rahmani Nejad, Pejman Rahmani Nejad, Ahmad Rahmani Nejad

Abstract:

We propose arresting atmospheric lightning and passing the electrical current of lightning bolts through underground water tanks to produce hydrogen, then storing the hydrogen in reservoirs to be used later as clean and sustainable energy. We also propose applying this method to the storage of extra electrical power (instead of lightning energy) during low energy demand periods: hydrogen is produced as a clean energy source, stored in big reservoirs, and later burned to generate electricity at an appropriate time. This method avoids the complicated process of changing the output power of nuclear power plants. It is also possible to pass an electric current through sodium chloride solution to produce chlorine and sodium, or through human waste to produce methane, etc. Although atmospheric lightning is an accidental phenomenon, using this free energy simply by connecting the output of lightning arresters to the output of the power plant during low energy demand periods, which requires no significant change in the design of the power plant and adds no cost, can be considered an entirely economical design.

Keywords: hydrogen gas, lightning energy, power plant, resistive element

Procedia PDF Downloads 119
24245 A Methodology to Integrate Data in the Company Based on the Semantic Standard in the Context of Industry 4.0

Authors: Chang Qin, Daham Mustafa, Abderrahmane Khiat, Pierre Bienert, Paulo Zanini

Abstract:

Nowadays, companies are facing many challenges in the process of digital transformation, which can be a complex and costly undertaking. Digital transformation involves the collection and analysis of large amounts of data, which can create challenges around data management and governance. Furthermore, it is also challenging to integrate data from multiple systems and technologies. Despite these pains, companies are still pursuing digitalization because, by embracing advanced technologies, they can improve efficiency, quality, decision-making, and customer experience while also creating different business models and revenue streams. This paper focuses on the issue of data being stored in data silos with different schemas and structures. The conventional approaches to addressing this issue involve utilizing data warehousing, data integration tools, data standardization, and business intelligence tools. However, these approaches primarily focus on the grammar and structure of the data and neglect the importance of semantic modeling and semantic standardization, which are essential for achieving data interoperability. In this session, the challenge of data silos in Industry 4.0 is addressed by developing a semantic modeling approach compliant with Asset Administration Shell (AAS) models as an efficient standard for communication in Industry 4.0. The paper highlights how our approach can facilitate the data mapping process and semantic lifting according to existing industry standards such as ECLASS and other industrial dictionaries. It also incorporates the Asset Administration Shell technology to model and map the company’s data and utilizes a knowledge graph for data storage and exploration.

Keywords: data interoperability in industry 4.0, digital integration, industrial dictionary, semantic modeling

Procedia PDF Downloads 78
24244 Developing Allometric Equations for More Accurate Aboveground Biomass and Carbon Estimation in Secondary Evergreen Forests, Thailand

Authors: Titinan Pothong, Prasit Wangpakapattanawong, Stephen Elliott

Abstract:

Shifting cultivation is an indigenous agricultural practice among upland people and has long been one of the major land-use systems in Southeast Asia. As a result, fallows and secondary forests have come to cover a large part of the region. However, they are increasingly being replaced by monocultures, such as corn cultivation. This is believed to be a main driver of deforestation and forest degradation, and one of the reasons behind the recurring winter smog crisis in Thailand and around Southeast Asia. Accurate biomass estimation of trees is important to quantify valuable carbon stocks and changes to these stocks in case of land use change. However, presently, Thailand lacks proper tools and optimal equations to quantify its carbon stocks, especially for secondary evergreen forests, including fallow areas after shifting cultivation and smaller trees with a diameter at breast height (DBH) of less than 5 cm. Developing new allometric equations to estimate biomass is urgently needed to accurately estimate and manage carbon storage in tropical secondary forests. This study established new equations using a destructive method at three study sites: approximately 50-year-old secondary forest, 4-year-old fallow, and 7-year-old fallow. Tree biomass was collected by harvesting 136 individual trees (including coppiced trees) from 23 species, with a DBH ranging from 1 to 31 cm. Oven-dried samples were sent for carbon analysis. Wood density was calculated from disk samples and samples collected with an increment borer from 79 species, including 35 species currently missing from the Global Wood Densities database. Several models were developed, showing that aboveground biomass (AGB) was strongly related to DBH, height (H), and wood density (WD). Including WD in the model was found to improve the accuracy of the AGB estimation. 
This study provides insights for reforestation management, and can be used to prepare baseline data for Thailand’s carbon stocks for the REDD+ and other carbon trading schemes. These may provide monetary incentives to stop illegal logging and deforestation for monoculture.
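To make the idea of an allometric equation concrete, the sketch below fits a log-log model of the form ln(AGB) = a + b ln(WD · DBH² · H) by ordinary least squares. This compound-predictor form is a commonly used one and is an assumption here; the abstract does not state which model forms the authors fitted. The function names and the synthetic numbers are hypothetical.

```python
import math


def fit_loglog(xs, ys):
    """Least-squares fit of ln(y) = a + b * ln(x); returns (a, b)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((u - mean_x) ** 2 for u in lx)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(lx, ly))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def predict_agb(a, b, dbh_cm, height_m, wood_density):
    """Predict AGB from the assumed compound predictor WD * DBH^2 * H."""
    x = wood_density * dbh_cm ** 2 * height_m
    return math.exp(a + b * x_log(x))


def x_log(x):
    """Natural log with an explicit positivity check on the predictor."""
    if x <= 0:
        raise ValueError("predictor must be positive")
    return math.log(x)


if __name__ == "__main__":
    # Synthetic predictor values and biomass generated from a known
    # power law, used only to show that the fit recovers it.
    xs = [5.0, 20.0, 80.0, 320.0, 1280.0]
    ys = [0.05 * x ** 0.9 for x in xs]
    a, b = fit_loglog(xs, ys)
    print(a, b, predict_agb(a, b, dbh_cm=10.0, height_m=8.0, wood_density=0.6))
```

In practice a back-transformation correction is often applied when predicting from a log-log fit; it is omitted here to keep the sketch short.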

Keywords: aboveground biomass, allometric equation, carbon stock, secondary forest

Procedia PDF Downloads 270
24243 Big Data Analytics and Data Security in the Cloud via Fully Homomorphic Encryption

Authors: Waziri Victor Onomza, John K. Alhassan, Idris Ismaila, Noel Dogonyaro Moses

Abstract:

This paper describes the problem of building secure computational services for encrypted information in cloud computing without decrypting the encrypted data. It thereby addresses the need for a computational encryption model that could enhance the security of big data with respect to the privacy, confidentiality, and availability of its users. The cryptographic model applied to computation on the encrypted data is the Fully Homomorphic Encryption Scheme. We contribute theoretical presentations of high-level computational processes, based on number theory and algebra, that can easily be integrated and leveraged in cloud computing, together with detailed theoretical mathematical concepts for the fully homomorphic encryption models. This contribution supports the full implementation of a cryptographic security algorithm for big data analytics.

Keywords: big data analytics, security, privacy, bootstrapping, homomorphic, homomorphic encryption scheme

Procedia PDF Downloads 364
24242 Protecting Privacy and Data Security in Online Business

Authors: Bilquis Ferdousi

Abstract:

With the exponential growth of the online business, the threat to consumers’ privacy and data security has become a serious challenge. This literature review-based study focuses on a better understanding of those threats and what legislative measures have been taken to address those challenges. Research shows that people are increasingly involved in online business using different digital devices and platforms, although this practice varies based on age groups. The threat to consumers’ privacy and data security is a serious hindrance in developing trust among consumers in online businesses. There are some legislative measures taken at the federal and state level to protect consumers’ privacy and data security. The study was based on an extensive review of current literature on protecting consumers’ privacy and data security and legislative measures that have been taken.

Keywords: privacy, data security, legislation, online business

Procedia PDF Downloads 86
24241 Flowing Online Vehicle GPS Data Clustering Using a New Parallel K-Means Algorithm

Authors: Orhun Vural, Oguz Bayat, Rustu Akay, Osman N. Ucan

Abstract:

This study presents a new parallel approach to clustering GPS data. It is evaluated by comparing the execution times of various clustering algorithms on GPS data. The paper proposes a neighborhood-based parallel K-means algorithm to make clustering faster. The proposed parallelization approach assumes that each GPS data point represents a vehicle and that vehicles close to each other communicate after the vehicles are clustered. The approach has been examined on continuously changing GPS data of different sizes and compared with the serial K-means algorithm and other serial clustering algorithms. The results demonstrated that the proposed parallel K-means algorithm works much faster than the other clustering algorithms.
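For orientation, the sketch below shows a generic data-parallel K-means: points are split into chunks, each worker assigns its chunk to the nearest centroid and returns per-cluster sums and counts, and the partial results are reduced into new centroids. This is the standard chunked scheme, not the neighborhood-based method of the paper, whose details the abstract does not give. All names are hypothetical, and threads are used only to show the structure; in CPython a real speed-up for this pure-Python workload would need processes or vectorized code.

```python
from concurrent.futures import ThreadPoolExecutor


def _partial_stats(chunk, centroids):
    """Assign each 2-D point in the chunk to its nearest centroid and
    return per-cluster coordinate sums and counts."""
    k = len(centroids)
    sums = [[0.0, 0.0] for _ in range(k)]
    counts = [0] * k
    for x, y in chunk:
        j = min(range(k),
                key=lambda c: (x - centroids[c][0]) ** 2
                + (y - centroids[c][1]) ** 2)
        sums[j][0] += x
        sums[j][1] += y
        counts[j] += 1
    return sums, counts


def parallel_kmeans(points, centroids, n_workers=2, n_iter=10):
    """Chunked K-means on (x, y) points; returns the final centroids."""
    centroids = [tuple(c) for c in centroids]
    k = len(centroids)
    # Interleaved split so every worker gets a similar share of points.
    chunks = [points[i::n_workers] for i in range(n_workers)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        for _ in range(n_iter):
            current = centroids
            results = list(pool.map(
                lambda ch: _partial_stats(ch, current), chunks))
            updated = []
            for j in range(k):
                n = sum(cnt[j] for _, cnt in results)
                if n == 0:
                    updated.append(current[j])  # keep an empty cluster as is
                    continue
                sx = sum(s[j][0] for s, _ in results)
                sy = sum(s[j][1] for s, _ in results)
                updated.append((sx / n, sy / n))
            if updated == current:  # assignments are stable: converged
                break
            centroids = updated
    return centroids


if __name__ == "__main__":
    # Toy coordinates, invented for illustration only.
    pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    print(parallel_kmeans(pts, [(0, 0), (10, 10)]))
```

Only the small per-cluster sums and counts cross worker boundaries, which is what makes the assignment step easy to distribute.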

Keywords: parallel k-means algorithm, parallel clustering, clustering algorithms, clustering on flowing data

Procedia PDF Downloads 200
24240 An Analysis of Privacy and Security for Internet of Things Applications

Authors: Dhananjay Singh, M. Abdullah-Al-Wadud

Abstract:

The Internet of Things (IoT) is the concept of a large-scale ecosystem of wireless actuators. The actuators are defined as the things in the IoT, namely those that contribute or produce data for the ecosystem. However, ubiquitous data collection, data security, privacy preservation, large-volume data processing, and intelligent analytics are some of the key challenges for IoT technologies. To address the security requirements, challenges, and threats in the IoT, we discuss a message authentication mechanism for IoT applications. Finally, we discuss a data encryption mechanism for authenticating messages before they propagate into IoT networks.

Keywords: Internet of Things (IoT), message authentication, privacy, security

Procedia PDF Downloads 360
24239 Effects of Roof Materials on Onion Storage

Authors: Imoukhuede Oladunni Bimpe, Ale Monday Olatunbosun

Abstract:

Periodic scarcity of onion requires an urgent solution in the Nigerian agro-economy. The high percentage of onion losses incurred after the harvesting period is due to the non-availability of appropriate storage facilities. Therefore, several storage structures were constructed with different roofing materials. The response of the materials to weather parameters such as temperature and relative humidity was evaluated to determine their effects on the performance of the storage structures. Temperature and relative humidity were recorded three times daily, along with the weight of the onions in each structure; losses, as indicated by loss indices such as shrinkage, rottenness, sprouting, and colour, were identified, and the percentage loss per week was determined. The highest mean percentage loss (22%) was observed in the structure with iron roofing materials, while the structure with thatched materials had the lowest (9.4%). The highest temperature was observed in the structure with asbestos roofing materials, with no significant difference in temperature between the structures with thatched and iron materials. The highest relative humidity was found in the structure with asbestos roofing material and the lowest in the structure with iron materials. It was concluded that the storage structure with the thatched roof had the best performance in terms of losses.

Keywords: onion, storage structures, weather parameters, roof materials, losses

Procedia PDF Downloads 587
24238 Cognitive Science Based Scheduling in Grid Environment

Authors: N. D. Iswarya, M. A. Maluk Mohamed, N. Vijaya

Abstract:

A grid is an infrastructure that allows large volumes of distributed data from multiple locations to be deployed toward a common goal. Scheduling data-intensive applications becomes challenging because the data sets are very large. Only two solutions exist to tackle this issue. First, computation that requires huge data sets to be processed can be transferred to the data site. Second, the required data sets can be transferred to the computation site. In the former scenario, the computation cannot be transferred since the servers are storage/data servers with little or no computational capability. Hence, the second scenario is considered for further exploration. During scheduling, transferring huge data sets from one site to another requires more network bandwidth. To mitigate this issue, this work focuses on incorporating cognitive science into scheduling. Cognitive science is the study of the human brain and its related activities. Current research mainly focuses on incorporating cognitive science into various computational modeling techniques. In this work, the problem-solving approach of the human brain is studied and incorporated into data-intensive scheduling in grid environments. Here, a cognitive engine (CE) is designed and deployed at various grid sites. The intelligent agents present in the CE help in analyzing requests and creating the knowledge base. Depending on the link capacity, a decision is taken whether to transfer the data sets or to partition them. The agents predict the next request in order to serve the requesting site with data sets in advance. This reduces the data availability time and data transfer time. The replica catalog and metadata catalog created by the agents assist in the decision-making process.
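The transfer-or-partition decision can be pictured with a simple rule based on estimated transfer time. The sketch below is only an illustration under stated assumptions: the function `plan_transfer`, its parameters, and the time-budget criterion are hypothetical, since the abstract does not describe how the cognitive engine actually decides.

```python
import math


def plan_transfer(dataset_gb, link_gbps, max_seconds):
    """Decide between a whole transfer and partitioning.

    Assumed rule (not from the paper): transfer the data set whole if it
    fits in the time budget over the given link; otherwise split it into
    the smallest number of equal parts that each fit the budget.
    Returns ("transfer", 1) or ("partition", n_parts).
    """
    if dataset_gb <= 0 or link_gbps <= 0 or max_seconds <= 0:
        raise ValueError("all arguments must be positive")
    seconds = dataset_gb * 8 / link_gbps  # gigabytes -> gigabits
    if seconds <= max_seconds:
        return ("transfer", 1)
    return ("partition", math.ceil(seconds / max_seconds))


if __name__ == "__main__":
    # Illustrative numbers only.
    print(plan_transfer(100, 10, 120))   # 80 s fits the 120 s budget
    print(plan_transfer(1000, 10, 120))  # 800 s does not fit
```

A real scheduler would also weigh replica locations and predicted requests, which the abstract mentions but this sketch ignores.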

Keywords: data grid, grid workflow scheduling, cognitive artificial intelligence

Procedia PDF Downloads 376
24237 Heritage and Tourism in the Era of Big Data: Analysis of Chinese Cultural Tourism in Catalonia

Authors: Xinge Liao, Francesc Xavier Roige Ventura, Dolores Sanchez Aguilera

Abstract:

With the development of the Internet, the study of tourism behavior has rapidly expanded from the traditional physical market to the online market. Data on the Internet is characterized by dynamic changes, and new data appear all the time. Recent years have been characterized by the generation of a large volume of data from forums, blogs, and other sources, which have expanded over time and space; together they constitute large-scale Internet data, known as Big Data. This data of technological origin, which derives from the use of devices and the activity of multiple users, is becoming a source of great importance for the study of geography and the behavior of tourists. The study will focus on cultural heritage tourist practices in the context of Big Data. The research will explore the characteristics and behavior of Chinese tourists in relation to the cultural heritage of Catalonia. Geographical information, target image, and perceptions in user-generated content will be studied through analysis of data from Weibo, the largest blog-based social network in China. By analyzing the behavior of heritage tourists in the Big Data environment, this study aims to understand the practices (activities, motivations, perceptions) of cultural tourists and, in turn, the needs and preferences of tourists, in order to better guide the sustainable development of tourism in heritage sites.

Keywords: Barcelona, Big Data, Catalonia, cultural heritage, Chinese tourism market, tourists’ behavior

Procedia PDF Downloads 119