Search results for: ground data
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 26101

24991 Dataset Quality Index: Development of Composite Indicator Based on Standard Data Quality Indicators

Authors: Sakda Loetpiparwanich, Preecha Vichitthamaros

Abstract:

Poor data quality is now considered one of the major costs of a data project. A data project with data quality awareness devotes a large share of its time to data quality processes, while a data project without such awareness suffers negative impacts on financial resources, efficiency, productivity, and credibility. One of the processes that takes a long time is defining the expectations and measurements of data quality, because the expectations differ according to the purpose of each data project. This is especially true of big data projects, which may involve many datasets and stakeholders and therefore need a long time to discuss and define quality expectations and measurements. This study therefore aimed to develop meaningful indicators that describe the overall data quality of each dataset, for quick comparison and prioritization. The objectives of this study were to: (1) develop practical data quality indicators and measurements, (2) develop data quality dimensions based on statistical characteristics, and (3) develop a composite indicator that can describe the overall data quality of each dataset. The sample consisted of more than 500 datasets from public sources obtained by random sampling. After the datasets were collected, the Dataset Quality Index (SDQI) was developed in five steps. First, we define standard data quality expectations. Second, we find indicators that can be measured directly on the data within the datasets. Third, the indicators are aggregated into dimensions using factor analysis. Next, the indicators and dimensions are weighted by the effort required for data preparation and by usability. Finally, the dimensions are aggregated into the composite indicator. The results showed that: (1) the developed indicators and measurements comprised ten indicators; (2) for the data quality dimensions based on statistical characteristics, the ten indicators can be reduced to four dimensions; and (3) the SDQI can describe the overall quality of each dataset and can separate datasets into three levels: Good Quality, Acceptable Quality, and Poor Quality. In conclusion, the SDQI provides an overall description of data quality within datasets, with a meaningful composition. The SDQI can be used to assess all data in a data project and to support effort estimation and prioritization. The SDQI also works well with Agile methods: it can be used for assessment in the first sprint, and after the initial evaluation is passed, more specific data quality indicators can be added in the next sprint.
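
The final aggregation step described in this abstract can be sketched as a weighted average of dimension scores followed by a three-level classification. This is a minimal illustration, not the authors' implementation: the abstract does not name the four dimensions or give the weights and level boundaries, so the dimension names, weights, and thresholds below are all hypothetical.

```python
def composite_index(scores, weights):
    """Weighted average of dimension scores; weights need not sum to 1."""
    total_weight = sum(weights[name] for name in scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(scores[name] * weights[name] for name in scores) / total_weight


def quality_level(index, good=0.8, acceptable=0.5):
    """Map an index in [0, 1] to one of three levels (thresholds are assumed)."""
    if index >= good:
        return "Good Quality"
    if index >= acceptable:
        return "Acceptable Quality"
    return "Poor Quality"


# Hypothetical dimension scores and weights for a single dataset.
scores = {"dim_a": 0.9, "dim_b": 0.7, "dim_c": 0.8, "dim_d": 0.6}
weights = {"dim_a": 2.0, "dim_b": 1.0, "dim_c": 1.0, "dim_d": 1.0}

index = composite_index(scores, weights)
print(round(index, 3), quality_level(index))
```

Keeping the weights separate from the scores means the same indicator measurements can be re-scored if the effort or usability weighting changes.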

Keywords: data quality, dataset quality, data quality management, composite indicator, factor analysis, principal component analysis

Procedia PDF Downloads 131
24990 Predictive Analysis for Big Data: Extension of Classification and Regression Trees Algorithm

Authors: Ameur Abdelkader, Abed Bouarfa Hafida

Abstract:

Since its inception, predictive analysis has revolutionized the IT industry through its robustness and decision-making facilities. It involves the application of a set of data processing techniques and algorithms in order to create predictive models. Its principle is based on finding relationships between explanatory variables and the predicted variables. Past occurrences are exploited to predict and to derive the unknown outcome. With the advent of big data, many studies have suggested the use of predictive analytics in order to process and analyze big data. Nevertheless, they have been curbed by the limits of classical methods of predictive analysis in case of a large amount of data. In fact, because of their volumes, their nature (semi or unstructured) and their variety, it is impossible to analyze efficiently big data via classical methods of predictive analysis. The authors attribute this weakness to the fact that predictive analysis algorithms do not allow the parallelization and distribution of calculation. In this paper, we propose to extend the predictive analysis algorithm, Classification And Regression Trees (CART), in order to adapt it for big data analysis. The major changes of this algorithm are presented and then a version of the extended algorithm is defined in order to make it applicable for a huge quantity of data.
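
The abstract does not describe how CART is extended, so the following is only a sketch of one common way to make split evaluation distributable, not the authors' algorithm. It assumes each data partition can report class counts on either side of a candidate threshold; because the counts merge by addition, the Gini impurity of a split can be computed without moving the raw rows. All function names and the toy data are hypothetical.

```python
from collections import Counter


def partition_counts(rows, threshold):
    """Class counts on each side of a threshold for one data partition.

    rows is a list of (feature_value, label) pairs.
    """
    left, right = Counter(), Counter()
    for value, label in rows:
        (left if value <= threshold else right)[label] += 1
    return left, right


def gini(counts):
    """Gini impurity of a class-count table (0.0 for an empty table)."""
    n = sum(counts.values())
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def split_impurity(partitions, threshold):
    """Weighted Gini impurity of a split, merging per-partition counts."""
    left, right = Counter(), Counter()
    for rows in partitions:  # each call could run on a different worker
        part_left, part_right = partition_counts(rows, threshold)
        left.update(part_left)   # Counter.update adds counts
        right.update(part_right)
    n_left, n_right = sum(left.values()), sum(right.values())
    n = n_left + n_right
    if n == 0:
        return 0.0
    return (n_left * gini(left) + n_right * gini(right)) / n


# Toy data split across two partitions.
partitions = [
    [(1.0, "a"), (2.0, "a"), (8.0, "b")],
    [(3.0, "a"), (7.0, "b"), (9.0, "b")],
]
best = min([2.5, 5.0, 7.5], key=lambda t: split_impurity(partitions, t))
print(best, split_impurity(partitions, best))
```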

Keywords: predictive analysis, big data, predictive analysis algorithms, CART algorithm

Procedia PDF Downloads 136
24989 Mitigating Acid Mine Drainage Pollution: A Case Study In the Witwatersrand Area of South Africa

Authors: Elkington Sibusiso Mnguni

Abstract:

In South Africa, mining has been a key economic sector since the discovery of gold in 1886 in the Witwatersrand region, where the city of Johannesburg is located. However, some mines have since been decommissioned, and the continuous pumping of acid mine drainage (AMD) also stopped, causing the AMD to rise towards the ground surface. This posed a serious environmental risk to the groundwater resources and river systems in the region. This paper documents the development and extent of the environmental damage as well as the measures implemented by the government to alleviate it. The study will add to the body of knowledge on AMD treatment to prevent environmental degradation. The relevant data and information were gathered and collated through a desktop study. The key findings include the social and environmental impact of the AMD, which includes the pollution of water sources for domestic use, leading to skin and other health problems, and the loss of biodiversity in some areas. It was also found that the technical intervention of constructing a plant to pump and treat the AMD using high-density sludge technology was the most effective short-term solution available while a long-term solution was being explored. Some successes and challenges experienced during the implementation of the project are also highlighted. The study will be a useful record of the current status of the AMD treatment interventions in the region.

Keywords: acid mine drainage, groundwater resources, pollution, river systems, technical intervention, high density sludge

Procedia PDF Downloads 182
24988 Designing Offshore Pipelines Facing the Geohazard of Active Seismic Faults

Authors: Maria Trimintziou, Michael Sakellariou, Prodromos Psarropoulos

Abstract:

Nowadays, the exploitation of hydrocarbon reserves in deep seas and oceans, in combination with the need to transport hydrocarbons among countries, has made the design, construction and operation of offshore pipelines very significant. From this perspective, it is evident that many more offshore pipelines are expected to be constructed in the near future. Since offshore pipelines usually cross extended areas, they may face a variety of geohazards that impose substantial permanent ground deformations (PGDs) on the pipeline and potentially threaten its integrity. When a geohazard area is encountered, there are three options. The first option is to avoid the problematic area through rerouting, which is usually regarded as an unfavorable solution due to its high cost. The second is to apply (if possible) mitigation/protection measures in order to eliminate the geohazard itself. Finally, the last option, an appealing one, is to allow the pipeline to cross the geohazard area, provided that the pipeline has been verified against the expected PGDs. In areas with moderate or high seismicity, the design of an offshore pipeline is more demanding due to earthquake-related geohazards, such as landslides, soil liquefaction phenomena, and active faults. It is worth mentioning that although there is great experience worldwide in offshore geotechnics and pipeline design, the experience in seismic design of offshore pipelines is rather limited because most pipelines have been constructed in non-seismic regions (e.g. North Sea, West Australia, Gulf of Mexico, etc.). The current study focuses on the seismic design of offshore pipelines against active faults. 
After an extensive literature review of the provisions of the seismic norms worldwide and of the available analytical methods, the study simulates numerically (through finite-element modeling and strain-based criteria) the distress of offshore pipelines subjected to PGDs induced by active seismic faults at the seabed. Factors, such as the geometrical properties of the fault, the mechanical properties of the ruptured soil formations, and the pipeline characteristics, are examined. After some interesting conclusions regarding the seismic vulnerability of offshore pipelines, potential cost-effective mitigation measures are proposed taking into account constructability issues.

Keywords: offshore pipelines, seismic design, active faults, permanent ground deformations (PGDs)

Procedia PDF Downloads 580
24987 Hierarchical Clustering Algorithms in Data Mining

Authors: Z. Abdullah, A. R. Hamdan

Abstract:

Clustering is the process of grouping objects and data into clusters so that data objects in the same cluster are similar to each other. Clustering algorithms are one of the areas of data mining and can be classified into partition-based, hierarchical, density-based, and grid-based approaches. In this paper, we survey and review four major hierarchical clustering algorithms: CURE, ROCK, CHAMELEON, and BIRCH. The resulting state of the art for these algorithms will help in eliminating current problems, as well as in deriving more robust and scalable clustering algorithms.

Keywords: clustering, unsupervised learning, algorithms, hierarchical

Procedia PDF Downloads 879
24986 Allometric Models for Biomass Estimation in Savanna Woodland Area, Niger State, Nigeria

Authors: Abdullahi Jibrin, Aishetu Abdulkadir

Abstract:

The development of allometric models is crucial to accurate forest biomass/carbon stock assessment. The aim of this study was to develop a set of biomass prediction models that enable the determination of total tree aboveground biomass for a savannah woodland area in Niger State, Nigeria. Based on data collected through biometric measurements of 1816 trees and destructive sampling of 36 trees, five species-specific models and one site-specific model were developed. The sample was distributed equally among the five most dominant species in the study site (Vitellaria paradoxa, Irvingia gabonensis, Parkia biglobosa, Anogeissus leiocarpus, Pterocarpus erinaceus). First, equations were developed for the five individual species. Second, the five species were pooled and used to develop a mixed-species allometric equation. Overall, there was a strong positive relationship between total tree biomass and stem diameter. Coefficients of determination (R² values) ranging from 0.93 to 0.99 (p < 0.001) were obtained for the models, with considerably low standard errors of the estimates (SEE), which confirms that total tree aboveground biomass has a significant relationship with the diameter at breast height (dbh). The F-test values for the biomass prediction models were also significant at p < 0.001, which indicates that the models are valid. This study recommends that, for improved biomass estimates in the study site, the site-specific biomass models should be used in preference to generic models.
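
The abstract reports a strong relationship between biomass and stem diameter but does not state the model form. The sketch below assumes the power-law form B = a·D^b that is commonly used for allometric equations, fitted by ordinary least squares on log-transformed values. The data are synthetic and generated for illustration only; they are not taken from the study.

```python
import math


def fit_power_law(dbh_cm, biomass_kg):
    """Least-squares fit of ln(B) = ln(a) + b * ln(D); returns (a, b, r2)."""
    xs = [math.log(d) for d in dbh_cm]
    ys = [math.log(m) for m in biomass_kg]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    syy = sum((y - mean_y) ** 2 for y in ys)
    b = sxy / sxx
    ln_a = mean_y - b * mean_x
    r2 = (sxy * sxy) / (sxx * syy)  # R^2 in log-log space
    return math.exp(ln_a), b, r2


# Synthetic data generated from B = 0.1 * D**2.5 (assumed, not from the study).
dbh = [5.0, 10.0, 20.0, 40.0]
biomass = [0.1 * d ** 2.5 for d in dbh]

a, b, r2 = fit_power_law(dbh, biomass)
print(f"a={a:.3f} b={b:.3f} R2={r2:.3f}")
```

Note that the R² computed here is for the log-log regression, which is not directly comparable to an R² computed on untransformed biomass.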

Keywords: allometry, biomass, carbon stock, model, regression equation, woodland, inventory

Procedia PDF Downloads 442
24985 End to End Monitoring in Oracle Fusion Middleware for Data Verification

Authors: Syed Kashif Ali, Usman Javaid, Abdullah Chohan

Abstract:

In large enterprises, multiple departments use different sorts of information systems and databases according to their needs. These systems are independent and heterogeneous in nature, and sharing information and data between them is not an easy task. The use of middleware technologies has made data sharing between systems very easy. However, monitoring the exchange of data between target and source systems for verification purposes is often complex or impossible for the maintenance department because of security and access privileges on those systems. In this paper, we present our experience with an end-to-end data monitoring approach at the middleware level, implemented in Oracle BPEL, for data verification without the help of any monitoring tool.

Keywords: service level agreement, SOA, BPEL, oracle fusion middleware, web service monitoring

Procedia PDF Downloads 476
24984 Dissimilarity Measure for General Histogram Data and Its Application to Hierarchical Clustering

Authors: K. Umbleja, M. Ichino

Abstract:

Symbolic data mining has been developed to analyze data in very large datasets. It is also useful when entry-specific details should remain hidden. Symbolic data mining is quickly gaining popularity as the datasets that need analyzing become ever larger. One type of symbolic data is the histogram, which makes it possible to store huge amounts of information in a single variable with a high level of granularity. Other types of symbolic data can also be described as histograms, which makes the histogram a very important and general symbolic data type: a method developed for histograms can also be applied to other types of symbolic data. Due to their complex structure, histograms are complicated to analyze. This paper proposes a method that compares two histogram-valued variables and thereby finds a dissimilarity between two histograms. The proposed method uses the Ichino-Yaguchi dissimilarity measure for mixed feature-type data analysis as a base and develops a dissimilarity measure specifically for histogram data, which allows histograms with different numbers of bins and bin widths (so-called general histograms) to be compared. The proposed dissimilarity measure is then used as a measure for clustering. Furthermore, a linkage method based on weighted averages is proposed, together with the concept of cluster compactness to measure the quality of clustering. The method is then validated by application to real datasets. As a result, the proposed dissimilarity measure is found to produce adequate and comparable results with general histograms without loss of detail or the need to transform the data.

Keywords: dissimilarity measure, hierarchical clustering, histograms, symbolic data analysis

Procedia PDF Downloads 156
24983 Coordinative Remote Sensing Observation Technology for a High Altitude Barrier Lake

Authors: Zhang Xin

Abstract:

Barrier lakes are lakes formed by storing water in valleys, river valleys or riverbeds after being blocked by landslide, earthquake, debris flow, and other factors. They have great potential safety hazards. When the water is stored to a certain extent, it may burst in case of strong earthquake or rainstorm, and the lake water overflows, resulting in large-scale flood disasters. In order to ensure the safety of people's lives and property in the downstream, it is very necessary to monitor the barrier lake. However, it is very difficult and time-consuming to manually monitor the barrier lake in high altitude areas due to the harsh climate and steep terrain. With the development of earth observation technology, remote sensing monitoring has become one of the main ways to obtain observation data. Compared with a single satellite, multi-satellite remote sensing cooperative observation has more advantages; its spatial coverage is extensive, observation time is continuous, imaging types and bands are abundant, it can monitor and respond quickly to emergencies, and complete complex monitoring tasks. Monitoring with multi-temporal and multi-platform remote sensing satellites can obtain a variety of observation data in time, acquire key information such as water level and water storage capacity of the barrier lake, scientifically judge the situation of the barrier lake and reasonably predict its future development trend. In this study, The Sarez Lake, which formed on February 18, 1911, in the central part of the Pamir as a result of blockage of the Murgab River valley by a landslide triggered by a strong earthquake with magnitude of 7.4 and intensity of 9, is selected as the research area. Since the formation of Lake Sarez, it has aroused widespread international concern about its safety. At present, the use of mechanical methods in the international analysis of the safety of Lake Sarez is more common, and remote sensing methods are seldom used. 
This study combines remote sensing data with field observation data and uses 'space-air-ground' joint observation technology to study the changes in the water level and water storage capacity of Lake Sarez in recent decades and to evaluate its safety. A collapse scenario is simulated, and the future development trend of Lake Sarez is predicted. The results show that: 1) in recent decades, the water level of Lake Sarez has not changed much and has remained stable; 2) unless there is a strong earthquake or heavy rain, Lake Sarez is unlikely to breach under normal conditions; 3) Lake Sarez will remain stable in the future, but it is necessary to establish an early warning system in the Lake Sarez area based on remote sensing of the area; 4) the coordinative remote sensing observation technology is feasible for the high altitude barrier lake of Sarez.

Keywords: coordinative observation, disaster, remote sensing, geographic information system, GIS

Procedia PDF Downloads 118
24982 WiFi Data Offloading: Bundling Method in a Canvas Business Model

Authors: Majid Mokhtarnia, Alireza Amini

Abstract:

Mobile operators face growing data traffic as a critical issue. As a result, a vital responsibility of operators is to deal with this trend in order to create added value. This paper addresses a bundling method within a Canvas business model for a WiFi Data Offloading (WDO) strategy, by which some elements of the model may be affected. In the proposed method, a number of data packages are sold to subscribers, some of which include a complimentary volume of WiFi-offloaded data. The paper analyses this method in terms of attractiveness and profitability. The results demonstrate that the quality of the WDO implementation strongly affects the final result, and they help the decision maker choose the best option.

Keywords: bundling, canvas business model, telecommunication, WiFi data offloading

Procedia PDF Downloads 193
24981 A Review on Application of Waste Tire in Concrete

Authors: M. A. Yazdi, J. Yang, L. Yihui, H. Su

Abstract:

The application of recycled waste tires in civil engineering practice, namely in asphalt paving mixtures and cement-based materials, has been gaining ground across the world. This review summarizes and compares in detail the recent achievements in the area of plain rubberized concrete (PRC). Different treatment methods for improving the performance of rubberized Portland cement concrete are discussed. The review also covers the effects of the size and amount of tire rubber on the mechanical and durability properties of PRC. The microstructural behaviour of rubberized concrete is also detailed.

Keywords: waste rubber aggregates, microstructure, treatment methods, size and content effects

Procedia PDF Downloads 322
24980 An Optimal Hybrid EMS System for a Hyperloop Prototype Vehicle

Authors: J. F. Gonzalez-Rojo, Federico Lluesma-Rodriguez, Temoatzin Gonzalez

Abstract:

Hyperloop, a new mode of transport, is gaining significance. It is a ground-based transport system that includes a levitation system, which avoids rolling friction forces, and that is enclosed in a tube whose controlled inner atmosphere lowers aerodynamic drag forces. Thus, hyperloop is proposed as a solution to the current limitations of ground transportation. Rolling and aerodynamic problems, which limit the top speeds of traditional high-speed rail and even maglev systems, are overcome using a hyperloop solution. Zeleros is one of the companies developing technology for hyperloop applications worldwide. It is working on a concept that reduces the infrastructure cost and minimizes the power consumption as well as the losses associated with magnetic drag forces. For this purpose, Zeleros proposes a Hybrid ElectroMagnetic Suspension (EMS) for its prototype. In the present manuscript, an active and optimal electromagnetic suspension levitation method based on individual modules with nearly zero power consumption is presented. This system consists of several hybrid permanent magnet-coil levitation units that can be arranged along the vehicle. The proposed unit redirects the magnetic field along a defined direction, forming a magnetic circuit and minimizing the losses due to field dispersion. This is achieved using an electrical steel core. Each module can stabilize the gap distance using the coil current and either linear or non-linear control methods. The ratio between weight and levitation force for each unit is 1/10. In addition, the quotient between the lifted weight and power consumption at the target gap distance is 1/3 [kg/W]. One degree of freedom (DoF), along the gap direction, is controlled by a single unit. However, when several units are present, a 5 DoF control (2 translational and 3 rotational) can be achieved, leading to the full attitude control of the vehicle. 
The proposed system has been successfully tested on a laboratory test bench, reaching TRL-4, and is currently in development towards TRL-5 when the association of modules to control 5 DoF is considered.

Keywords: active optimal control, electromagnetic levitation, HEMS, high-speed transport, hyperloop

Procedia PDF Downloads 141
24979 Distributed Perceptually Important Point Identification for Time Series Data Mining

Authors: Tak-Chung Fu, Ying-Kit Hung, Fu-Lai Chung

Abstract:

In the field of time series data mining, the Perceptually Important Point (PIP) identification process was first introduced in 2001. The process was originally designed for financial time series pattern matching and was later found suitable for time series dimensionality reduction and representation. Its strength lies in preserving the overall shape of a time series by identifying its salient points. With the rise of Big Data, time series account for a major proportion of the data, especially data generated by sensors in Internet of Things (IoT) environments. Given the nature of PIP identification and its successful applications, it is worth exploring further the opportunity to apply PIP to time series ‘Big Data’. However, the performance of PIP identification is always considered a limitation when dealing with ‘Big’ time series data. In this paper, two distributed versions of PIP identification based on the Specialized Binary (SB) Tree are proposed. The proposed approaches remove the bottleneck of running the PIP identification process on a standalone computer. The distributed versions achieve an improvement in speed.
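
For orientation, the basic sequential PIP identification process can be sketched as follows: the first and last points are taken as PIPs, and each round adds the point farthest from the line joining its two neighbouring PIPs. This sketch uses vertical distance as the distance measure and a naive rescan per round; it is not the SB-Tree variant or either of the distributed versions proposed in the paper, and the function names and sample series are illustrative.

```python
def vertical_distance(series, left, right, i):
    """Vertical distance from point i to the line through points left and right."""
    slope = (series[right] - series[left]) / (right - left)
    on_line = series[left] + slope * (i - left)
    return abs(series[i] - on_line)


def identify_pips(series, k):
    """Return the indices of k perceptually important points, in time order."""
    n = len(series)
    if n < 2 or k < 2:
        raise ValueError("need at least two points and k >= 2")
    pips = [0, n - 1]  # the first and last points are always PIPs
    while len(pips) < min(k, n):
        best_index, best_distance = None, -1.0
        # Scan each gap between adjacent PIPs for the farthest point.
        for left, right in zip(pips, pips[1:]):
            for i in range(left + 1, right):
                d = vertical_distance(series, left, right, i)
                if d > best_distance:
                    best_index, best_distance = i, d
        pips.append(best_index)
        pips.sort()
    return pips


prices = [1.0, 2.0, 9.0, 3.0, 2.0, 1.0, 6.0, 1.0]
print(identify_pips(prices, 4))
```

Because every round rescans all non-PIP points, the cost of this naive version grows with both the series length and the number of PIPs requested, which illustrates why performance becomes a concern on large series.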

Keywords: distributed computing, performance analysis, Perceptually Important Point identification, time series data mining

Procedia PDF Downloads 426
24978 Estimating Poverty Levels from Satellite Imagery: A Comparison of Human Readers and an Artificial Intelligence Model

Authors: Ola Hall, Ibrahim Wahab, Thorsteinn Rognvaldsson, Mattias Ohlsson

Abstract:

The subfield of poverty and welfare estimation that applies machine learning tools and methods on satellite imagery is a nascent but rapidly growing one. This is in part driven by the sustainable development goal, whose overarching principle is that no region is left behind. Among other things, this requires that welfare levels can be accurately and rapidly estimated at different spatial scales and resolutions. Conventional tools of household surveys and interviews do not suffice in this regard. While they are useful for gaining a longitudinal understanding of the welfare levels of populations, they do not offer adequate spatial coverage for the accuracy that is needed, nor are their implementation sufficiently swift to gain an accurate insight into people and places. It is this void that satellite imagery fills. Previously, this was near-impossible to implement due to the sheer volume of data that needed processing. Recent advances in machine learning, especially the deep learning subtype, such as deep neural networks, have made this a rapidly growing area of scholarship. Despite their unprecedented levels of performance, such models lack transparency and explainability and thus have seen limited downstream applications as humans generally are apprehensive of techniques that are not inherently interpretable and trustworthy. While several studies have demonstrated the superhuman performance of AI models, none has directly compared the performance of such models and human readers in the domain of poverty studies. In the present study, we directly compare the performance of human readers and a DL model using different resolutions of satellite imagery to estimate the welfare levels of demographic and health survey clusters in Tanzania, using the wealth quintile ratings from the same survey as the ground truth data. The cluster-level imagery covers all 608 cluster locations, of which 428 were classified as rural. 
The imagery for the human readers was sourced from the Google Maps Platform at an ultra-high resolution of 0.6m per pixel at zoom level 18, while that of the machine learning model was sourced from the comparatively lower resolution Sentinel-2 10m per pixel data for the same cluster locations. Rank correlation coefficients of between 0.31 and 0.32 achieved by the human readers were much lower when compared to those attained by the machine learning model – 0.69-0.79. This superhuman performance by the model is even more significant given that it was trained on the relatively lower 10-meter resolution satellite data while the human readers estimated welfare levels from the higher 0.6m spatial resolution data from which key markers of poverty and slums – roofing and road quality – are discernible. It is important to note, however, that the human readers did not receive any training before ratings, and had this been done, their performance might have improved. The stellar performance of the model also comes with the inevitable shortfall relating to limited transparency and explainability. The findings have significant implications for attaining the objective of the current frontier of deep learning models in this domain of scholarship – eXplainable Artificial Intelligence through a collaborative rather than a comparative framework.
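
The abstract compares readers and model by rank correlation but does not say which coefficient was used. The sketch below assumes Spearman's coefficient, computed as the Pearson correlation of average ranks so that tied quintile ratings are handled. The ratings in the example are hypothetical and are not data from the study.

```python
def average_ranks(values):
    """Ranks starting at 1, with ties sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared
        start = end + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences with nonzero variance."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical wealth quintiles (ground truth) and ratings for six clusters.
survey_quintile = [1, 2, 3, 4, 5, 3]
reader_rating = [2, 1, 3, 5, 4, 3]
print(round(spearman(survey_quintile, reader_rating), 3))
```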

Keywords: poverty prediction, satellite imagery, human readers, machine learning, Tanzania

Procedia PDF Downloads 99
24977 Analysing Techniques for Fusing Multimodal Data in Predictive Scenarios Using Convolutional Neural Networks

Authors: Philipp Ruf, Massiwa Chabbi, Christoph Reich, Djaffar Ould-Abdeslam

Abstract:

In recent years, convolutional neural networks (CNN) have demonstrated high performance in image analysis, but oftentimes, there is only structured data available regarding a specific problem. By interpreting structured data as images, CNNs can effectively learn and extract valuable insights from tabular data, leading to improved predictive accuracy and uncovering hidden patterns that may not be apparent in traditional structured data analysis. In applying a single neural network for analyzing multimodal data, e.g., both structured and unstructured information, significant advantages in terms of time complexity and energy efficiency can be achieved. Converting structured data into images and merging them with existing visual material offers a promising solution for applying CNN in multimodal datasets, as they often occur in a medical context. By employing suitable preprocessing techniques, structured data is transformed into image representations, where the respective features are expressed as different formations of colors and shapes. In an additional step, these representations are fused with existing images to incorporate both types of information. This final image is finally analyzed using a CNN.

Keywords: CNN, image processing, tabular data, mixed dataset, data transformation, multimodal fusion

Procedia PDF Downloads 113
24976 Knowledge Discovery and Data Mining Techniques in Textile Industry

Authors: Filiz Ersoz, Taner Ersoz, Erkin Guler

Abstract:

This paper addresses issues and techniques for the textile industry using data mining. Data mining was applied to data on the stitching of garment products obtained from a textile company. The techniques applied to the data were the CHAID algorithm, the CART algorithm, regression analysis, and artificial neural networks. Classification-based analyses were used, and a decision model for production per person and the variables affecting production was obtained by this method. The results show that as daily working time increases, production per person decreases. In addition, total daily working time shows the strongest negative relationship with production per person.

Keywords: data mining, textile production, decision trees, classification

Procedia PDF Downloads 344
24975 Theoretical Study of Electronic Structure of Erbium (Er), Fermium (Fm), and Nobelium (No)

Authors: Saleh O. Allehabi, V. A. Dzubaa, V. V. Flambaum, Jiguang Li, A. V. Afanasjev, S. E. Agbemava

Abstract:

Recently developed versions of the configuration method for open shells, configuration interaction with perturbation theory (CIPT), and configuration interaction with many-body perturbation theory (CI+MBPT) techniques are used to study the electronic structure of Er, Fm, and No atoms. Excitation energies of odd states connected to the even ground state by electric dipole transitions, the corresponding transition rates, isotope shift, hyperfine structure, ionization potentials, and static scalar polarizabilities are calculated. A way of extracting parameters of the nuclear charge distribution beyond the nuclear root mean square (RMS) radius, e.g., the quadrupole deformation parameter β, is demonstrated. In nuclei with spin > 1/2, the parameter β is extracted from the quadrupole hyperfine structure. With zero nuclear spin or spin 1/2, this is impossible since the quadrupole hyperfine structure vanishes, so a different method was developed. Measurements of at least two atomic transitions are needed to disentangle the contributions of changes in deformation and in nuclear RMS radius to the field isotope shift. This is important for testing nuclear theory and for searching for the hypothetical island of stability. Fm and No are heavy elements approaching the superheavy region, for which experimental data are very scarce: only seven lines for Fm and one line for No. Since Er and Fm have similar electronic structures, calculations for Er serve as a guide to the accuracy of the calculations. Twenty-eight new levels of the Fm atom are reported.

Keywords: atomic spectra, electronic transitions, isotope effect, electron correlation calculations for atoms

Procedia PDF Downloads 150
24974 Investigation of Delivery of Triple Play Data in GE-PON Fiber to the Home Network

Authors: Ashima Anurag Sharma

Abstract:

Optical fiber based networks can deliver performance that supports the increasing demand for high-speed connections. One of the new technologies that has emerged in recent years is the Passive Optical Network. This research paper aims to show the simultaneous delivery of triple play services (data, voice, and video). A comparison between various data rates is presented. It is demonstrated that as the data rate increases, the number of users that can be supported decreases due to the increase in bit error rate.

Keywords: BER, PON, TDMPON, GPON, CWDM, OLT, ONT

Procedia PDF Downloads 522
24973 Microarray Gene Expression Data Dimensionality Reduction Using PCA

Authors: Fuad M. Alkoot

Abstract:

Different experimental technologies such as microarray sequencing have been proposed to generate high-resolution genetic data, in order to understand the complex dynamic interactions between complex diseases and the biological system components of genes and gene products. However, the generated samples have a very large dimension, reaching thousands, which hinders all attempts to design a classifier system that can identify diseases based on such data. Additionally, the high overlap in the class distributions makes the task more difficult. The data we experiment with were generated for the identification of autism. They include 142 samples, which is small compared to the large dimension of the data. Classifier systems trained on these data yield very low classification rates that are almost equivalent to a guess. We aim to reduce the data dimension and make the data more suitable for classification. Here, we experiment with applying a multistage PCA to the genetic data to reduce its dimensionality. Results show a significant improvement in the classification rates, which increases the possibility of building an automated system for autism detection.
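
As a rough illustration of the underlying idea, the sketch below performs a single PCA step in plain Python: it centers the data, builds the covariance matrix, and finds the leading principal direction by power iteration. It is not the multistage procedure of the abstract, whose details are not given here. The function names and the toy data are assumptions for illustration, and forming the full covariance matrix is only practical for small feature counts.

```python
import math


def center(rows):
    """Subtract the per-feature mean from every sample."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]


def covariance(centered):
    """Sample covariance matrix of already centered data."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def leading_component(cov, iterations=200):
    """Leading eigenvector of cov by power iteration.

    The uniform start vector must not be orthogonal to the leading
    direction; that holds for the toy data below.
    """
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return v
        v = [x / norm for x in w]
    return v


def project(centered, component):
    """One-dimensional scores of each sample along the component."""
    return [sum(a * b for a, b in zip(r, component)) for r in centered]


# Toy data: two perfectly correlated features, so one component suffices.
samples = [[float(x), 2.0 * x] for x in range(10)]
centered = center(samples)
component = leading_component(covariance(centered))
scores = project(centered, component)
```

Each sample is reduced from two features to one score. In a real pipeline the same projection would be applied to the top few components before training a classifier.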

Keywords: PCA, gene expression, dimensionality reduction, classification, autism

Procedia PDF Downloads 557
24972 Catastrophic Burden and Impoverishment Effect of WASH Diseases: A Ground Analysis of Bhadohi District Uttar Pradesh, India

Authors: Jyoti Pandey, Rajiv Kumar Bhatt

Abstract:

In the absence of proper sanitation, people suffer from high levels of infectious disease, leading to high morbidity and mortality. This directly affects the ability of a country to maintain an efficient economy and implies great personal suffering among infected individuals and their families. This paper aims to estimate the catastrophic expenditure of households in terms of the direct and indirect losses that a person faces due to water, sanitation, and hygiene (WASH) related illness; the severity of the situation is assessed by estimating the impoverishment effect. We used primary survey data for this objective. Descriptive and analytical research types are used. The survey was carried out with a carefully formulated questionnaire that covers all the relevant variables and probable outcomes. A total of 300 households are covered in this study, selected by multistage random sampling. The cost of illness approach is followed for assessing economic impact. The study shows that a significant portion of total consumption expenditure is lost to the treatment of water and sanitation related diseases. Infectious, water-borne, and vector-borne diseases can be checked by providing adequate sanitation facilities, and the 2.02% loss in income can be recovered if the pathogen transmission mechanisms are checked.

Keywords: water, sanitation, impoverishment, catastrophic expenditure

Procedia PDF Downloads 81
24971 Investigating Underground Explosion-Like Sounds in Sarableh City and Its Possible Connection with Geological Hazards

Authors: Hosein Almasikia

Abstract:

Sarableh City is located in the west of Iran, in the Zagros seismic zone. After the Azgole-Sarpol Zahab earthquake, with a magnitude of 3.7 on the Richter scale, on November 21, 2016, people in some parts of Sarableh heard frightening sounds. Some residents also reported a sound similar to the grinding of a mill. Vibration studies and field investigations showed that these sounds have a geological origin, are emitted from the ground to the surface, and may be related to geological hazards such as landslides and the collapse of karstic zones. In this study, an attempt has been made to investigate the possible relationship between these abnormal sounds and geological hazards.

Keywords: Sarableh, Zagros, landslide, karstic zone

Procedia PDF Downloads 57
24970 Data Science-Based Key Factor Analysis and Risk Prediction of Diabetic

Authors: Fei Gao, Rodolfo C. Raga Jr.

Abstract:

This research proposal aims to ascertain the major risk factors for diabetes and to design a predictive model for risk assessment. The project aims to improve diabetes early detection and management by utilizing data science techniques, which may improve patient outcomes and healthcare efficiency. Using the Diabetes Health Indicators Dataset from Kaggle as the research data, the phase relation values of each attribute were used to analyze and choose the attributes that might influence the examiner's survival probability. We compare and evaluate eight machine learning algorithms. Our investigation begins with comprehensive data preprocessing, including feature engineering and dimensionality reduction, aimed at enhancing data quality. The dataset, comprising health indicators and medical data, serves as the foundation for training and testing these algorithms. A rigorous cross-validation process is applied, and we assess performance using five key metrics: accuracy, precision, recall, F1-score, and area under the receiver operating characteristic curve (AUC-ROC). After analyzing the data characteristics, we investigate their impact on the likelihood of diabetes and develop corresponding risk indicators.
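
Four of the five metrics named above follow directly from the confusion-matrix counts of a binary classifier. The sketch below shows one way to compute them in plain Python. The function names and the toy labels are illustrative assumptions, not the authors' code or data. AUC-ROC is left out because it needs predicted scores rather than hard labels.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels encoded as 0 and 1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1-score from hard predictions."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels only: 1 = diabetic, 0 = not diabetic.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
print(classification_metrics(y_true, y_pred))
```

Under cross-validation, these values would be computed on each held-out fold and then averaged.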

Keywords: diabetes, risk factors, predictive model, risk assessment, data science techniques, early detection, data analysis, Kaggle

Procedia PDF Downloads 66
24969 Superhydrophobic Materials: A Promising Way to Enhance Resilience of Electric System

Authors: M. Balordi, G. Santucci de Magistris, F. Pini, P. Marcacci

Abstract:

The increase in extreme meteorological events is among the most important causes of damage and blackouts across the whole electric system. In particular, icing on ground-wires and overhead lines, due to snowstorms or harsh winter conditions, very often gives rise to the collapse of cables and towers in both cold and warm climates. On the other hand, the high concentration of contaminants in the air, due to natural and/or anthropic causes, is reflected in high levels of pollutants layered on glass and ceramic insulators, causing frequent and unpredictable flashover events. Overhead line and insulator failures lead to blackouts, dangerous and expensive maintenance, and serious inefficiencies in the distribution service. Imparting superhydrophobic (SHP) properties to conductors, ground-wires, and insulators is one way to address all these problems. Indeed, in some cases, the SHP surface can delay the ice nucleation time and decrease the ice nucleation temperature, preventing ice formation. Besides, thanks to the low surface energy, the adhesion force between ice and a superhydrophobic material is low, and the ice can be easily detached from the surface. Moreover, it is well known that superhydrophobic surfaces can have self-cleaning properties: these hinder the deposition of pollution and decrease the probability of flashover phenomena. This study presents three different routes to impart superhydrophobicity to aluminum, zinc, and glass specimens, which represent the main constituent materials of conductors, ground-wires, and insulators, respectively. The route to impart superhydrophobicity to the metallic surfaces can be summarized as a three-step process: 1) sandblasting treatment, 2) chemical-hydrothermal treatment, and 3) coating deposition. The first step is required to create a micro-roughness. 
In the chemical-hydrothermal treatment, a nano-scale metallic oxide (Al or Zn) is grown which, together with the sandblasting treatment, brings about a hierarchical micro-nano structure. Coating with an alkylated or fluorinated siloxane decreases the surface energy and gives rise to superhydrophobic surfaces. In order to functionalize the glass, different superhydrophobic powders, obtained by sol-gel synthesis, were prepared. The specimens were then covered with a commercial primer and the powders were deposited on them. All the resulting metallic and glass surfaces showed noticeable superhydrophobic behavior, with very high water contact angles (>150°) and very low roll-off angles (<5°). The three optimized processes are fast, cheap, and safe, and can be easily replicated on industrial scales. The anti-icing and self-cleaning properties of the surfaces were assessed with several indoor lab tests, which showed remarkable anti-icing properties and self-cleaning behavior with respect to the bare materials. Finally, to evaluate the anti-snow properties of the samples, some SHP specimens were exposed to real snowfall events in the RSE outdoor test facility located in Vinadio, western Alps: the coated samples delay the formation of snow sleeves and facilitate the detachment of the snow. The good results of both indoor and outdoor tests make these materials promising for further development in large-scale applications.

Keywords: superhydrophobic coatings, anti-icing, self-cleaning, anti-snow, overhead lines

Procedia PDF Downloads 179
24968 Artificial Neural Network and Satellite Derived Chlorophyll Indices for Estimation of Wheat Chlorophyll Content under Rainfed Condition

Authors: Muhammad Naveed Tahir, Wang Yingkuan, Huang Wenjiang, Raheel Osman

Abstract:

Numerous models are used in prediction and decision-making, but most of them are linear, and linear models reach their limits when the data are non-linear, so accurate estimation is difficult. Artificial Neural Networks (ANN) have found extensive acceptance for modeling the complex, non-linear real world. ANNs have more general and flexible functional forms than traditional statistical methods can effectively deal with. The link between information technology and agriculture will become firmer in the near future. Monitoring crop biophysical properties non-destructively can provide a rapid and accurate understanding of a crop's response to various environmental influences. Crop chlorophyll content is an important indicator of crop health and therefore of crop yield. In recent years, remote sensing has been accepted as a robust tool for site-specific management by detecting crop parameters at both local and large scales. The present research combined an ANN model with satellite-derived chlorophyll indices from LANDSAT 8 imagery for real-time estimation of wheat chlorophyll content. Cloud-free LANDSAT 8 scenes were acquired (Feb-March 2016-17) at the same time as a ground-truthing campaign in which chlorophyll was estimated using a SPAD-502. Different vegetation indices were derived from the LANDSAT 8 imagery using ERDAS Imagine (v.2014) software for chlorophyll determination. The vegetation indices included the Normalized Difference Vegetation Index (NDVI), Green Normalized Difference Vegetation Index (GNDVI), Chlorophyll Absorbed Ratio Index (CARI), Modified Chlorophyll Absorbed Ratio Index (MCARI), and Transformed Chlorophyll Absorbed Ratio Index (TCARI). For ANN modeling, MATLAB and SPSS (ANN) tools were used. A Multilayer Perceptron (MLP) in MATLAB provided very satisfactory results. 
For the MLP, 61.7% of the data were used for training, 28.3% for validation, and the remaining 10% to evaluate the ANN model results. For error evaluation, the sum of squares error and the relative error were used. The ANN model summary showed a sum of squares error of 10.786 and an average overall relative error of 0.099. MCARI and NDVI proved to be the more sensitive indices for assessing wheat chlorophyll content, with the highest coefficients of determination, R²=0.93 and 0.90 respectively. The results suggest that using high spatial resolution satellite imagery with an ANN model to retrieve crop chlorophyll content provides an accurate, reliable assessment of crop health status at a larger scale, which can help in managing crop nutrition requirements in real time.
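
Two of the indices listed above, NDVI and GNDVI, are simple normalized differences of band reflectances. The sketch below applies the standard definitions per pixel. It assumes surface reflectance inputs, with LANDSAT 8 band 5 as near-infrared, band 4 as red, and band 3 as green. The function names and reflectance values are illustrative and are not taken from the study.

```python
def normalized_difference(band_a, band_b):
    """(a - b) / (a + b), returning 0.0 when both bands are zero."""
    denominator = band_a + band_b
    if denominator == 0:
        return 0.0
    return (band_a - band_b) / denominator


def ndvi(nir, red):
    """Normalized Difference Vegetation Index."""
    return normalized_difference(nir, red)


def gndvi(nir, green):
    """Green Normalized Difference Vegetation Index."""
    return normalized_difference(nir, green)


# Made-up reflectances for one vegetated pixel.
pixel = {"nir": 0.50, "red": 0.10, "green": 0.15}
print(ndvi(pixel["nir"], pixel["red"]))
print(gndvi(pixel["nir"], pixel["green"]))
```

Index values computed this way for each sampled field location would form the inputs to the ANN, with the SPAD-502 readings as the targets.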

Keywords: ANN, chlorophyll content, chlorophyll indices, satellite images, wheat

Procedia PDF Downloads 143
24967 A Methodology to Integrate Data in the Company Based on the Semantic Standard in the Context of Industry 4.0

Authors: Chang Qin, Daham Mustafa, Abderrahmane Khiat, Pierre Bienert, Paulo Zanini

Abstract:

Companies today face many challenges in the process of digital transformation, which can be a complex and costly undertaking. Digital transformation involves the collection and analysis of large amounts of data, which can create challenges around data management and governance. Integrating data from multiple systems and technologies is a further challenge. Despite these difficulties, companies still pursue digitalization because, by embracing advanced technologies, they can improve efficiency, quality, decision-making, and customer experience while also creating different business models and revenue streams. This paper focuses on the problem that data is stored in data silos with different schemas and structures. Conventional approaches to this problem involve data warehousing, data integration tools, data standardization, and business intelligence tools. However, these approaches primarily focus on the grammar and structure of the data and neglect the importance of semantic modeling and semantic standardization, which are essential for achieving data interoperability. Here, the challenge of data silos in Industry 4.0 is addressed by developing a semantic modeling approach compliant with Asset Administration Shell (AAS) models as an efficient standard for communication in Industry 4.0. The paper highlights how our approach can facilitate the data mapping process and semantic lifting according to existing industry standards such as ECLASS and other industrial dictionaries. It also incorporates the Asset Administration Shell technology to model and map the company’s data and utilizes a knowledge graph for data storage and exploration.

Keywords: data interoperability in industry 4.0, digital integration, industrial dictionary, semantic modeling

Procedia PDF Downloads 89
24966 Antibiogram Profile of Antibacterial Multidrug Resistance in Democratic Republic of Congo: Situation in Bukavu City Hospitals

Authors: Justin Ntokamunda Kadima, Christian Ahadi Irenge, Patient Birindwa Mulashe, Félicien Mushagalusa Kasali, Patient Wimba

Abstract:

Background: Bacterial strains carrying multidrug resistance traits are gaining ground worldwide, especially in countries with limited resources. This study aimed to evaluate the spread of multidrug-resistant bacterial strains in Bukavu city hospitals in the Democratic Republic of Congo. Methods: We analyzed 758 antibiogram records from the files of patients seen between January 2016 and December 2017 at three reference hospitals selected as sentinel sites, namely the Panzi General Reference Hospital (HGP), BIO-PHARM hospital (HBP), and Saint Luc Clinic (CSL). Results: Of 758 isolates tested, the laboratories identified 12 bacterial strains in 712 isolates, of which 223 (29.42%) presented an MDR profile, including Escherichia coli (11.48%), Klebsiella pneumoniae (6.07%), Enterobacter (5.8%), Staphylococcus aureus and coagulase-negative Staphylococci (1.58%), Proteus mirabilis (1.85%), Salmonella enterica (1.19%), Pseudomonas aeruginosa (0.53%), Streptococcus pneumoniae (0.4%), Citrobacter (0.13%), Neisseria gonorrhoeae (0.13%), Enterococcus faecalis (0.13%), and Morganella morganii (0.13%). Infected patients were significantly more often adults than children (73.1% vs. 21.5%) and mainly women (63.7% vs. 30.9%; p = 0.001). Conclusion: The observed expansion requires that hospital therapeutic committees set up an effective clinical management system and define the right combinations of antibiotics.
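
A quick arithmetic check clarifies which denominator the percentages use. The sketch below shows that the per-organism shares sum to the reported 29.42%, and that 223 of 758 gives the same figure, so the shares are expressed relative to all 758 isolates tested. The variable names are illustrative; the numbers are copied from the abstract.

```python
# Share of all 758 tested isolates with an MDR profile, per organism (%).
mdr_share_percent = {
    "Escherichia coli": 11.48,
    "Klebsiella pneumoniae": 6.07,
    "Enterobacter": 5.8,
    "Staphylococcus aureus and coagulase-negative Staphylococci": 1.58,
    "Proteus mirabilis": 1.85,
    "Salmonella enterica": 1.19,
    "Pseudomonas aeruginosa": 0.53,
    "Streptococcus pneumoniae": 0.4,
    "Citrobacter": 0.13,
    "Neisseria gonorrhoeae": 0.13,
    "Enterococcus faecalis": 0.13,
    "Morganella morganii": 0.13,
}

# Sum of the individual shares, rounded to the precision of the abstract.
total_share = round(sum(mdr_share_percent.values()), 2)

# Overall MDR share recomputed from the raw counts.
share_from_counts = round(100 * 223 / 758, 2)

print(total_share, share_from_counts)
```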

Keywords: multidrug resistance, bacteria, antibiogram, Bukavu

Procedia PDF Downloads 74
24965 Big Data Analytics and Data Security in the Cloud via Fully Homomorphic Encryption

Authors: Waziri Victor Onomza, John K. Alhassan, Idris Ismaila, Noel Dogonyaro Moses

Abstract:

This paper describes the problem of building secure computational services that operate on encrypted information in cloud computing without decrypting the data. It therefore addresses the need for a computational encryption model that could enhance the security of big data with respect to the privacy, confidentiality, and availability of its users. The cryptographic model applied to computation on the encrypted data is the Fully Homomorphic Encryption Scheme. We contribute theoretical presentations of high-level computational processes, based on number theory and algebra, that can easily be integrated and leveraged in cloud computing, together with detailed theoretical mathematical concepts for fully homomorphic encryption models. This contribution supports the full implementation of a cryptographic security algorithm for big data analytics.
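
The idea of computing on ciphertexts can be illustrated with a much weaker property than full homomorphism. The sketch below uses textbook RSA with tiny primes, which is homomorphic for multiplication only: multiplying two ciphertexts yields a ciphertext of the product. It is a toy for intuition, is insecure, and is not the fully homomorphic scheme discussed in the paper. All names and parameters are illustrative assumptions.

```python
# Toy textbook RSA with tiny primes: insecure, for illustration only.
P, Q = 61, 53
N = P * Q                    # public modulus, 3233
PHI = (P - 1) * (Q - 1)      # Euler totient of N, 3120
E = 17                       # public exponent, coprime with PHI
D = pow(E, -1, PHI)          # private exponent via modular inverse (Python 3.8+)


def encrypt(message):
    """Encrypt an integer 0 <= message < N with the public key."""
    return pow(message, E, N)


def decrypt(ciphertext):
    """Decrypt with the private key."""
    return pow(ciphertext, D, N)


def multiply_encrypted(c1, c2):
    """Multiply two ciphertexts without ever decrypting them."""
    return (c1 * c2) % N


# A server holding only ciphertexts can compute an encrypted product.
c_a, c_b = encrypt(7), encrypt(6)
c_product = multiply_encrypted(c_a, c_b)
print(decrypt(c_product))    # 42, valid because 7 * 6 < N
```

A fully homomorphic scheme supports both addition and multiplication on ciphertexts, which is what allows arbitrary computations to be evaluated on encrypted data.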

Keywords: big data analytics, security, privacy, bootstrapping, homomorphic, homomorphic encryption scheme

Procedia PDF Downloads 372
24964 DEEPMOTILE: Motility Analysis of Human Spermatozoa Using Deep Learning in Sri Lankan Population

Authors: Chamika Chiran Perera, Dananjaya Perera, Chirath Dasanayake, Banuka Athuraliya

Abstract:

Male infertility is a major problem worldwide, and it is a neglected and sensitive health issue in Sri Lanka. It can be assessed by analyzing human semen samples. Sperm motility is one of many factors used to evaluate a male’s fertility potential. In Sri Lanka, this analysis is performed manually. Manual methods are time consuming and operator dependent, although they are reliable when performed by an expert. Machine learning and deep learning technologies are currently being investigated to automate spermatozoa motility analysis, but these methods are not yet reliable. Such automatic methods tend to produce false positives and false detections. Current automatic methods rely on different techniques, and some of them are very expensive. Due to the geographical variance in spermatozoa characteristics, current automatic methods are not reliable for motility analysis in Sri Lanka. The suggested system, DeepMotile, explores a method to analyze the motility of human spermatozoa automatically and to present it to andrology laboratories to overcome current issues. DeepMotile is a novel deep learning method for analyzing spermatozoa motility parameters in the Sri Lankan population. To implement the proposed approach, Sri Lankan patient data were collected anonymously as a dataset, and glass slides were used as a low-cost technique to analyze semen samples. The problem was framed as microscopic object detection and tracking. YOLOv5 was customized and used as the object detector, and it achieved 94% mAP (mean average precision), 86% precision, and 90% recall on the gathered dataset. StrongSORT was used as the object tracker, and it was validated with andrology experts due to the unavailability of annotated ground truth data. Furthermore, this research has identified many potential directions for further investigation, and andrology experts can use this system to analyze motility parameters with realistic accuracy.

Keywords: computer vision, deep learning, convolutional neural networks, multi-target tracking, microscopic object detection and tracking, male infertility detection, motility analysis of human spermatozoa

Procedia PDF Downloads 102
24963 Suitability Number of Coarse-Grained Soils and Relationships among Fineness Modulus, Density and Strength Parameters

Authors: Khandaker Fariha Ahmed, Md. Noman Munshi, Tarin Sultana, Md. Zoynul Abedin

Abstract:

Suitability number (SN) is perhaps one of the most important parameters of coarse-grained soil in assessing its appropriateness for use as backfill in retaining structures, sand compaction piles, vibro compaction, and other similar foundation and ground improvement works. Though determined in an empirical manner, it is imperative to study SN to understand its relation to other aggregate properties such as fineness modulus (FM) and to the strength and density properties of sandy soil. The present paper reports the findings of a study examining these properties of sandy soil. Random numbers were generated to obtain the percent finer on various sieve sizes, and fineness modulus and suitability numbers were predicted. Sand samples were collected from the field, and test samples were prepared to determine maximum density, minimum density, and shear strength parameter φ for a particular fineness modulus and corresponding suitability number. Five samples with an SN rated excellent (0-10) and three samples with an SN rated fair (20-30) were taken, and the relevant tests were done. The data obtained from the laboratory tests were statistically analyzed. Results show that FM decreases as SN increases. Within the SN range rated as excellent (0-10), φ shows a decreasing trend for higher values of SN. It is found that SN depends on various combinations of grain size properties such as D10, D20, D30, and D50. Strong linear relationships were obtained between SN and FM (R²=0.93) and between SN and φ (R²=0.94). Correlation equations are proposed to define relationships among SN, φ, and FM.
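
The abstract does not state how SN and FM are computed. The sketch below assumes the commonly cited definition of suitability number attributed to Brown (1977), SN = 1.7 * sqrt(3/D50² + 1/D20² + 1/D10²) with grain sizes in millimetres, and the usual definition of fineness modulus as the sum of cumulative percentages retained on the standard sieves divided by 100. The rating bands outside the two quoted in the abstract, the function names, and the sample values are assumptions for illustration.

```python
import math


def suitability_number(d50_mm, d20_mm, d10_mm):
    """Suitability number from grain sizes in mm (assumed Brown formula)."""
    return 1.7 * math.sqrt(3 / d50_mm ** 2 + 1 / d20_mm ** 2 + 1 / d10_mm ** 2)


def backfill_rating(sn):
    """Rating bands; 'excellent' and 'fair' match the abstract, others assumed."""
    if sn <= 10:
        return "excellent"
    if sn <= 20:
        return "good"
    if sn <= 30:
        return "fair"
    if sn <= 50:
        return "poor"
    return "unsuitable"


def fineness_modulus(cumulative_percent_retained):
    """Sum of cumulative % retained on the standard sieves, divided by 100."""
    return sum(cumulative_percent_retained) / 100.0


# Made-up grading for one sand sample.
sn = suitability_number(d50_mm=1.0, d20_mm=0.5, d10_mm=0.25)
fm = fineness_modulus([0, 10, 30, 55, 80, 95])
print(round(sn, 2), backfill_rating(sn), fm)
```

Because the grain sizes appear in the denominators, finer sands give larger SN values, which is consistent with the reported decrease of FM as SN increases.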

Keywords: density, fineness modulus, shear strength parameter, suitability number

Procedia PDF Downloads 101
24962 Protecting Privacy and Data Security in Online Business

Authors: Bilquis Ferdousi

Abstract:

With the exponential growth of online business, the threat to consumers’ privacy and data security has become a serious challenge. This literature review-based study aims at a better understanding of those threats and of the legislative measures that have been taken to address them. Research shows that people are increasingly involved in online business using different digital devices and platforms, although this practice varies by age group. The threat to consumers’ privacy and data security is a serious hindrance to developing consumer trust in online businesses. Some legislative measures have been taken at the federal and state levels to protect consumers’ privacy and data security. The study was based on an extensive review of current literature on protecting consumers’ privacy and data security and on the legislative measures that have been taken.

Keywords: privacy, data security, legislation, online business

Procedia PDF Downloads 98