Search results for: big data types. big data ecosystem
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 28366

28126 Epidemiology of Congenital Heart Defects in Kazakhstan: Data from Unified National Electronic Healthcare System 2014-2020

Authors: Dmitriy Syssoyev, Aslan Seitkamzin, Natalya Lim, Kamilla Mussina, Abduzhappar Gaipov, Dimitri Poddighe, Dinara Galiyeva

Abstract:

Background: Data on the epidemiology of congenital heart defects (CHD) in Kazakhstan is scarce. Therefore, the aim of this study was to describe the incidence, prevalence and all-cause mortality of patients with CHD in Kazakhstan, using national large-scale registry data from the Unified National Electronic Healthcare System (UNEHS) for the period of 2014-2020. Methods: In this retrospective cohort study, the included data pertained to all patients diagnosed with CHD in Kazakhstan and registered in UNEHS between January 2014 and December 2020. CHD was defined based on International Classification of Diseases 10th Revision (ICD-10) codes Q20-Q26. Incidence, prevalence, and all-cause mortality rates were calculated per 100,000 population. Survival analysis was performed using Cox proportional hazards regression modeling and the Kaplan-Meier method. Results: In total, 66,512 patients were identified. Among them, 59,534 (89.5%) were diagnosed with a single CHD, while 6,978 (10.5%) had more than two CHDs. The median age at diagnosis was 0.08 years (interquartile range (IQR) 0.01 – 0.66) for people with multiple CHD types and 0.39 years (IQR 0.04 – 8.38) for those with a single CHD type. The most common CHD types were atrial septal defect (ASD) and ventricular septal defect (VSD), accounting for 25.8% and 21.2% of single CHD cases, respectively. The most common multiple types of CHD were ASD with VSD (23.4%), ASD with patent ductus arteriosus (PDA) (19.5%), and VSD with PDA (17.7%). The incidence rate of CHD decreased from 64.6 to 47.1 cases per 100,000 population among men and from 68.7 to 42.4 among women. The prevalence rose from 66.1 to 334.1 cases per 100,000 population among men and from 70.8 to 328.7 among women. Mortality rates showed a slight increase from 3.5 to 4.7 deaths per 100,000 in men and from 2.9 to 3.7 in women. Median follow-up was 5.21 years (IQR 2.47 – 11.69). 
Male sex (HR 1.60, 95% CI 1.45 - 1.77), having multiple CHDs (HR 2.45, 95% CI 2.01 - 2.97), and living in a rural area (HR 1.32, 95% CI 1.19 - 1.47) were associated with a higher risk of all-cause mortality. Conclusion: The incidence of CHD in Kazakhstan has shown a moderate decrease between 2014 and 2020, while prevalence and mortality have increased. Male sex, multiple CHD types, and rural residence were significantly associated with a higher risk of all-cause mortality.
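
The two calculations named in the Methods, rates per 100,000 population and the Kaplan-Meier method, can be sketched as below. This is a minimal illustration and not the authors' analysis code: the function names and the toy inputs are assumptions made for the example, and none of the numbers come from the UNEHS registry.

```python
def rate_per_100k(events: int, population: int) -> float:
    """Return the number of events per 100,000 population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return events * 100_000 / population


def kaplan_meier(durations, observed):
    """Product-limit survival estimate.

    durations: follow-up time for each patient
    observed:  1 if the patient died at that time, 0 if censored
    Returns a list of (time, survival probability) at each death time.
    """
    data = sorted(zip(durations, observed))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical toy inputs, for illustration only.
toy_rate = rate_per_100k(events=650, population=1_000_000)
toy_curve = kaplan_meier(durations=[1, 2, 3], observed=[1, 0, 1])
```

Censored patients leave the risk set without lowering the survival estimate, which is what distinguishes the Kaplan-Meier estimate from a simple proportion of survivors.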

Keywords: congenital heart defects (CHD), epidemiology, incidence, Kazakhstan, mortality, prevalence

Procedia PDF Downloads 43
28125 Data Mining Approach for Commercial Data Classification and Migration in Hybrid Storage Systems

Authors: Mais Haj Qasem, Maen M. Al Assaf, Ali Rodan

Abstract:

Parallel hybrid storage systems consist of a hierarchy of different storage devices that vary in terms of data reading speed. As we ascend in the hierarchy, data reading speed becomes faster. Thus, migrating the application's important data that will be accessed in the near future to the uppermost level will reduce the application's I/O waiting time, hence reducing its execution elapsed time. In this research, we implement a trace-driven two-level parallel hybrid storage system prototype that consists of HDDs and SSDs. The prototype uses data mining techniques to classify the application's data in order to determine its near-future data accesses in parallel with its on-demand requests. The important data (i.e., the data that the application will access in the near future) are continuously migrated to the uppermost level of the hierarchy. Our simulation results show that our data migration approach, integrated with data mining techniques, reduces the application's execution elapsed time by at least 22% across a variety of traces.

Keywords: hybrid storage system, data mining, recurrent neural network, support vector machine

Procedia PDF Downloads 271
28124 Incorporating Spatial Transcriptome Data into Ligand-Receptor Analyses to Discover Regional Activation in Cells

Authors: Eric Bang

Abstract:

Interactions between receptors and ligands are crucial for many essential biological processes, including neurotransmission and metabolism. Ligand-receptor analyses that examine cell behavior and interactions often utilize cell type-specific RNA expression from single-cell RNA sequencing (scRNA-seq) data. Using CellPhoneDB, a public repository consisting of ligands, receptors, and ligand-receptor interactions, cell-cell interactions were explored in a specific scRNA-seq dataset from kidney tissue, and the results were portrayed with dot plots and heat maps. Depending on the type of cell, each ligand-receptor pair was aligned with the interacting cell type, and the posterior probabilities of these associations were calculated, with corresponding p-values reflecting average expression values between the triads and their significance. Using single-cell data (sample kidney cell references), genes in the dataset were cross-referenced with ones in the existing CellPhoneDB dataset. For example, a gene such as Pleiotrophin (PTN) present in the single-cell data also needed to be present in the CellPhoneDB dataset. Using the single-cell transcriptomics data via slide-seq and reference data, the CellPhoneDB program defines cell types and plots them in different formats, with the two main ones being dot plots and heat map plots. The dot plot displays derived measures of the cell-to-cell interaction scores and p-values. For the dot plot, each row shows a ligand-receptor pair, and each column shows the two interacting cell types. CellPhoneDB defines interactions and interaction levels from the gene expression level; since the p-value is on a -log10 scale, larger dots represent more significant interactions. By performing an interaction analysis, a significant interaction was discovered for myeloid and T-cell ligand-receptor pairs, including those between Secreted Phosphoprotein 1 (SPP1) and Fibronectin 1 (FN1), which is consistent with previous findings. 
It was proposed that an effective protocol would involve a filtration step where cell types would be filtered out, depending on which ligand-receptor pair is activated in that part of the tissue, as well as the incorporation of the CellPhoneDB data in a streamlined workflow pipeline. The filtration step would be in the form of a Python script that expedites the manual process necessary for dataset filtration. Being in Python allows it to be integrated with the CellPhoneDB dataset for future workflow analysis. The manual process involves filtering cell types based on which ligand-receptor pair is activated in kidney cells. One limitation of this would be the fact that some pairings are activated in multiple cells at a time, so the manual manipulation of the data is reflected prior to analysis. Using the filtration script, accurate sorting is incorporated into the CellPhoneDB database rather than waiting until the output is produced and then subsequently applying spatial data. It was envisioned that this would reveal where in the cell various ligands and receptors are interacting with different cell types, allowing for easier identification of which cells are being impacted and why, for the purpose of disease treatment. The hope is that this new computational method utilizing spatially explicit ligand-receptor association data can be used to uncover previously unknown specific interactions within kidney tissue.

Keywords: bioinformatics, Ligands, kidney tissue, receptors, spatial transcriptome

Procedia PDF Downloads 109
28123 Discussion on Big Data and One of Its Early Training Application

Authors: Fulya Gokalp Yavuz, Mark Daniel Ward

Abstract:

This study focuses on a contemporary and inevitable topic of Data Science and its exemplary application for early career building: Big Data and the Living Learning Community (LLC). Academia and industry agree on the importance of Big Data. However, both are at risk of neglecting training in this interdisciplinary area. Some traditional teaching doctrines are far from effective for Data Science. Practitioners need intuition and real-life examples of how to apply new methods to terabyte-scale data. We explain the scope of Data Science training and exemplify its early-stage application with the LLC, a National Science Foundation (NSF)-funded project under the supervision of Prof. Ward since 2014. Essentially, we aim to give professors, researchers and practitioners some intuition for combining data science tools into comprehensive real-life examples, guided by mentees' feedback. By discussing mentoring methods and the computational challenges of Big Data, we intend to underline its potential more concretely.

Keywords: Big Data, computation, mentoring, training

Procedia PDF Downloads 321
28122 How to Perform Proper Indexing?

Authors: Watheq Mansour, Waleed Bin Owais, Mohammad Basheer Kotit, Khaled Khan

Abstract:

Efficient query processing is one of the utmost requisites in any business environment to satisfy consumer needs. This paper investigates the various types of indexing models, viz. primary, secondary, and multi-level. The investigation is done under the ambit of various types of queries to which each indexing model performs with efficacy. This study also discusses the inherent advantages and disadvantages of each indexing model and how indexing models can be chosen based on a particular environment. This paper also draws parallels between various indexing models and provides recommendations that would help a Database administrator to zero-in on a particular indexing model attributed to the needs and requirements of the production environment. In addition, to satisfy industry and consumer needs attributed to the colossal data generation nowadays, this study has proposed two novel indexing techniques that can be used to index highly unstructured and structured Big Data with efficacy. The study also briefly discusses some best practices that the industry should follow in order to choose an indexing model that is apposite to their prerequisites and requirements.
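
To make the primary and multi-level indexing models mentioned above concrete, the sketch below builds one level of a sparse primary index over sorted keys. It is a generic textbook illustration, not one of the two novel techniques the paper proposes; the function names and the block size are assumptions chosen for the example.

```python
from bisect import bisect_right


def build_sparse_index(sorted_keys, block_size):
    """Split sorted keys into blocks and index the first key of each block."""
    blocks = [
        sorted_keys[i:i + block_size]
        for i in range(0, len(sorted_keys), block_size)
    ]
    first_keys = [block[0] for block in blocks]
    return first_keys, blocks


def lookup(first_keys, blocks, key):
    """Return True if key is stored, scanning only one block."""
    # Find the last block whose first key is <= key.
    pos = bisect_right(first_keys, key) - 1
    if pos < 0:
        return False  # key is smaller than every stored key
    return key in blocks[pos]


# Hypothetical data, for illustration only.
first_keys, blocks = build_sparse_index([2, 4, 6, 8, 10, 12, 14], block_size=3)
found = lookup(first_keys, blocks, 10)
```

A multi-level index repeats the same idea by building another sparse index over `first_keys`, so each level narrows the search before any data block is read.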

Keywords: indexing, hashing, latent semantic indexing, B-tree

Procedia PDF Downloads 118
28121 Ontological Modeling Approach for Statistical Databases Publication in Linked Open Data

Authors: Bourama Mane, Ibrahima Fall, Mamadou Samba Camara, Alassane Bah

Abstract:

National Statistical Institutes hold a large volume of data, generally in a format that conditions how the information it contains can be published. Each household or business data collection project includes a dissemination platform for its implementation. These previously used dissemination methods do not promote rapid access to information and, in particular, do not offer the option of linking data for in-depth processing. In this paper, we present an approach to modeling these data to publish them in a format intended for the Semantic Web. Our objective is to be able to publish all this data in a single platform and offer the option to link with other external data sources. The approach will be applied to data from major national surveys, such as those on employment, poverty and child labor, and the general census of the population of Senegal.

Keywords: Semantic Web, linked open data, database, statistic

Procedia PDF Downloads 146
28120 A Hybrid Feature Selection and Deep Learning Algorithm for Cancer Disease Classification

Authors: Niousha Bagheri Khulenjani, Mohammad Saniee Abadeh

Abstract:

Learning from very big datasets is a significant problem for most present data mining and machine learning algorithms. MicroRNA (miRNA) is one of the important big genomic and non-coding datasets presenting the genome sequences. In this paper, a hybrid method for the classification of miRNA data is proposed. Due to the variety of cancers and the high number of genes, analyzing the miRNA dataset has been a challenging problem for researchers. The number of features is high relative to the number of samples, and the data are imbalanced. A feature selection method is used to select the features best able to distinguish classes and to eliminate obscure features. Afterward, a Convolutional Neural Network (CNN) classifier is used to classify cancer types, with a Genetic Algorithm employed to find optimized CNN hyper-parameters. To make CNN classification faster, a Graphics Processing Unit (GPU) is recommended for performing the mathematical computations in parallel. The proposed method is tested on a real-world dataset with 8,129 patients, 29 different types of tumors, and 1,046 miRNA biomarkers, taken from The Cancer Genome Atlas (TCGA) database.

Keywords: cancer classification, feature selection, deep learning, genetic algorithm

Procedia PDF Downloads 82
28119 Residents’ Awareness of Green Infrastructure Types in the Neighbourhood: Panacea for Biodiversity Conservation

Authors: Adedotun Ayodele Dipeolu, Olusegun Ayotunde Oriola

Abstract:

Rapid urban growth has led to the loss of contact with nature for most urban residents. While Green Infrastructure (GI) is promoted as a strategy to manage ecosystems' functionality, the extent to which residents are aware of GI types which serve as alternatives to conventional landscapes to be conserved remains unclear. This paper examines the awareness level of GI types among residents of Lagos Metropolis, Nigeria and the association of their demographic characteristics with the level of awareness. A multi-stage sampling technique was used to select 1560 residents who completed semi-structured questionnaires. Descriptive statistics were used to explore data distributions, while a t-test assessed the differences in the awareness level of the male and female participants. Of the 23 different types of GI facilities identified in the study area, residents reported a high level of awareness of just five: green gardens, green parks, grasses, street trees, and sports fields. They reported a low level of awareness of the remaining 18 GI types. Awareness of GI types is presently low in the study area. Increased awareness will encourage care and protection of green infrastructure by residents, which will consequently enhance availability and conservation of more biodiversity in Lagos, Nigeria, and other nations.

Keywords: awareness, biodiversity conservation, environmental sustainability, green infrastructure, urban centres

Procedia PDF Downloads 171
28118 The Role of Data Protection Officer in Managing Individual Data: Issues and Challenges

Authors: Nazura Abdul Manap, Siti Nur Farah Atiqah Salleh

Abstract:

For decades, the misuse of personal data has been a critical issue. Malaysia has accepted responsibility by implementing the Malaysian Personal Data Protection Act 2010 (PDPA 2010) to secure personal data. After more than a decade, this legislation is set to be revised by the current PDPA 2023 Amendment Bill to align with the world's key personal data protection regulations, such as the European Union General Data Protection Regulation (GDPR). Among the suggested adjustments is the Data User's appointment of a Data Protection Officer (DPO) to ensure the commercial entity's compliance with the PDPA 2010 criteria. The change is expected to be enacted in parliament fairly soon; nevertheless, based on the experience of the Personal Data Protection Department (PDPD) in implementing the Act, it is projected that there will be a slew of additional concerns associated with the DPO mandate. Consequently, the goal of this article is to highlight the issues that the DPO will encounter and how the PDPD should respond to them. The study used a qualitative technique based on an examination of the current literature. This research reveals that DPOs are likely to face obstacles, and thus there should be a definite, clear guideline in place to aid DPOs in executing their tasks. It is argued that appointing a DPO is a wise measure for ensuring that the legal data security requirements are met.

Keywords: guideline, law, data protection officer, personal data

Procedia PDF Downloads 46
28117 Data Collection Based on the Questionnaire Survey In-Hospital Emergencies

Authors: Nouha Mhimdi, Wahiba Ben Abdessalem Karaa, Henda Ben Ghezala

Abstract:

The methods identified in data collection are diverse: electronic media, focus group interviews and short-answer questionnaires [1]. Poor-quality data resulting, for example, from poorly designed questionnaires, the absence of good translators or interpreters, and the incorrect recording of data can lead to conclusions that are not supported by the data or that focus only on the average effect of the program or policy. There are several solutions to avoid or minimize the most frequent errors, including obtaining expert advice on the design or adaptation of data collection instruments, or using technologies that allow better "anonymity" in the responses [2]. In this context, we opted to collect good-quality data through a sizeable questionnaire-based survey on hospital emergencies to improve emergency services and alleviate the problems encountered. In this paper, we present our study and detail the steps followed to collect relevant, consistent and practical data.

Keywords: data collection, survey, questionnaire, database, data analysis, hospital emergencies

Procedia PDF Downloads 76
28116 Soil Surface Insect Diversity of Tobacco Agricultural Ecosystem in Imogiri, Bantul District of Yogyakarta Special Region, Indonesia

Authors: Martina Faika Harianja, Zahtamal, Indah Nuraini, Septi Mutia Handayani, R. C. Hidayat Soesilohadi

Abstract:

Tobacco is a valuable commodity that supports economic growth in Indonesia. Soil surface insects are important components that influence the productivity of tobacco. Thus, the diversity of soil surface insects needs to be studied in order to acquire information about the specific roles of each species in the ecosystem. This research aimed to study the soil surface insect diversity of the tobacco agricultural ecosystem in Imogiri, Bantul District of Yogyakarta Special Region, Indonesia. Samples were collected by pitfall-sugar bait trap in August 2015. Results showed that 5 orders, 8 families, and 17 genera of soil surface insects were found. The diversity category of soil surface insects in the tobacco agricultural ecosystem was poor. The dominant genus was Monomorium, with a dominance index score of 0.07588. The percentages of insects' roles were omnivores 43%, detritivores 24%, predators 19%, and herbivores 14%.

Keywords: diversity, Indonesia, soil surface insect, tobacco

Procedia PDF Downloads 298
28115 Building Energy Modeling for Networks of Data Centers

Authors: Eric Kumar, Erica Cochran, Zhiang Zhang, Wei Liang, Ronak Mody

Abstract:

The objective of this article was to create a modelling framework that exposes the marginal costs of shifting workloads across geographically distributed data-centers. Geographical distribution of internet services helps to optimize their performance for localized end users with lowered communications times and increased availability. However, due to the geographical and temporal effects, the physical embodiments of a service's data center infrastructure can vary greatly. In this work, we first identify that the sources of variances in the physical infrastructure primarily stem from local weather conditions, specific user traffic profiles, energy sources, and the types of IT hardware available at the time of deployment. Second, we create a traffic simulator that indicates the IT load at each data-center in the set as an approximator for user traffic profiles. Third, we implement a framework that quantifies the global level energy demands using building energy models and the traffic profiles. The results of the model provide a time series of energy demands that can be used for further life cycle analysis of internet services.

Keywords: data-centers, energy, life cycle, network simulation

Procedia PDF Downloads 115
28114 Federated Learning in Healthcare

Authors: Ananya Gangavarapu

Abstract:

Convolutional Neural Network (CNN) based models are providing diagnostic capabilities on par with medical specialists in many specialty areas. However, collecting medical data for training purposes is very challenging because of increased regulation of data collection and privacy concerns around personal health data. Gathering the data becomes even more difficult if the capture devices are edge-based mobile devices (like smartphones) with feeble wireless connectivity in rural/remote areas. In this paper, I would like to highlight the Federated Learning approach to mitigating data privacy and security issues.
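
The core idea of Federated Learning is that devices share model parameters rather than raw patient data. One commonly used aggregation rule is federated averaging (FedAvg), sketched below under the assumption that each client returns a flat list of weights and its local sample count. The abstract does not say which aggregation rule the paper uses, so the function name and the toy values are illustrative only.

```python
def federated_average(client_weights, client_sizes):
    """Average client model weights, weighted by local dataset size.

    client_weights: one list of floats per client (same length each)
    client_sizes:   number of local training samples per client
    """
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("at least one client must hold data")
    n_params = len(client_weights[0])
    averaged = [0.0] * n_params
    for weights, size in zip(client_weights, client_sizes):
        for j, w in enumerate(weights):
            # Clients with more data contribute proportionally more.
            averaged[j] += w * size / total
    return averaged


# Hypothetical round with two clients; only weights leave the devices.
global_weights = federated_average(
    client_weights=[[1.0, 2.0], [3.0, 4.0]],
    client_sizes=[1, 3],
)
```

In a full system the server would send `global_weights` back to the devices for another round of local training, so the raw health data never leaves the device.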

Keywords: deep learning in healthcare, data privacy, federated learning, training in distributed environment

Procedia PDF Downloads 106
28113 A Survey in Techniques for Imbalanced Intrusion Detection System Datasets

Authors: Najmeh Abedzadeh, Matthew Jacobs

Abstract:

An intrusion detection system (IDS) is a software application that monitors malicious activities and generates alerts if any are detected. However, most network activities in IDS datasets are normal, and the relatively small number of attacks makes the available data imbalanced. Consequently, cyber-attacks can hide inside a large number of normal activities, and machine learning algorithms have difficulty learning and classifying the data correctly. In this paper, a comprehensive literature review is conducted on different types of algorithms for implementing the IDS and on methods for correcting the imbalanced IDS dataset. The most famous algorithms are machine learning (ML), deep learning (DL), synthetic minority over-sampling technique (SMOTE), and reinforcement learning (RL). Most of the research uses the CSE-CIC-IDS2017, CSE-CIC-IDS2018, and NSL-KDD datasets for evaluating the algorithms.
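
Of the techniques listed, SMOTE is the one aimed directly at the imbalance: it creates synthetic minority (attack) samples by interpolating between a minority sample and one of its nearest minority neighbours. The sketch below is a simplified standard-library version for illustration; the function name, parameters, and toy points are assumptions, and production work would normally use an established library implementation instead.

```python
import math
import random


def smote_sketch(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic points from a list of minority-class tuples."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the chosen sample.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical two-feature attack samples, for illustration only.
attacks = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
new_attacks = smote_sketch(attacks, n_new=5)
```

Because every synthetic point lies on a segment between two real minority samples, the new data stays inside the region already occupied by the minority class.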

Keywords: IDS, imbalanced datasets, sampling algorithms, big data

Procedia PDF Downloads 273
28112 The Utilization of Big Data in Knowledge Management Creation

Authors: Daniel Brian Thompson, Subarmaniam Kannan

Abstract:

The volume of knowledge in the world and within organizational repositories is already immense and is constantly increasing. To cope with this, Big Data implementations and algorithms are used to obtain new or enhanced knowledge for decision-making. The transition from data to knowledge brings transformational changes that provide tangible benefits to those implementing these practices. Today, various organizations derive knowledge from observations and intuitions, and this information or data is translated into best practices for knowledge acquisition, generation and sharing. Through the widespread usage of Big Data, the main intention is to provide information that has been cleaned and analyzed to nurture tangible insights that an organization can apply to its knowledge-creation practices based on facts and figures. The translation of data into knowledge generates value for an organization, enabling it to make decisive decisions about adopting best practices. Without a strong foundation of knowledge and Big Data, businesses are not able to grow and improve within the competitive environment.

Keywords: big data, knowledge management, data driven, knowledge creation

Procedia PDF Downloads 73
28111 Study of Phenotypic Polymorphism and Detection of Genotypic Polymorphism in Menochilus sexmaculatus (Coleoptera: Insecta) Using RAPD PCR

Authors: Huma Balouch

Abstract:

Menochilus sexmaculatus, commonly known as the six-spotted zigzag ladybird, is aphidophagous and the most frequently misidentified coccinellid owing to the occurrence of numerous color variants. The correct identification of Menochilus sexmaculatus and its strains is necessary to implement the use of biological control. In the present study, phenotypic and genotypic polymorphism was investigated in Menochilus sexmaculatus collected from the Punjab, NWFP and Sindh provinces of Pakistan. Six different morphs of the species were distinguished by analyzing elytral color and spot pattern, and then the Polymerase Chain Reaction was used to generate random amplification of polymorphic DNA (RAPD) from the six types of Menochilus sexmaculatus. Forty primers (OPA & OPC Kit) were used to perform RAPD PCR on the six types, of which seven primers revealed different patterns related to the Menochilus sexmaculatus types. These seven primers (OPA-04, OPA-09, OPA-18, OPC-04, OPC-12, OPC-15 and OPC-18) produced 111 clear polymorphic bands and 6 scorable strain-specific markers. The cluster analysis applied to the RAPD data showed high polymorphism among the six types, and it can be concluded that these six types are six polymorphic strains of the same species.

Keywords: Menochilus sexmaculatus, aphidophagus, coccinellids, phenotypic and genotypic polymorphism, RAPD-PCR, strain specific markers

Procedia PDF Downloads 454
28110 Survey on Data Security Issues Through Cloud Computing Amongst Sme’s in Nairobi County, Kenya

Authors: Masese Chuma Benard, Martin Onsiro Ronald

Abstract:

Businesses have been using cloud computing more frequently in recent years to benefit from its advantages. However, employing cloud computing also introduces new security concerns, particularly with regard to data security, potential risks and weaknesses that could be exploited by attackers, and the tactics and strategies that could be used to lessen these risks. This study examines data security issues in cloud computing amongst SMEs in Nairobi County, Kenya. The study used a sample size of 48 and a mixed-methods research approach. The findings show that the data owner has no control over the cloud merchant's data management procedures, so there is no way to ensure that data is handled legally. This implies that control over data stored in the cloud is lost. Data and information stored in the cloud may face a range of availability issues due to internet outages; this can represent a significant risk to data kept in shared clouds. Integrity, availability, and secrecy are all mentioned.

Keywords: data security, cloud computing, information, information security, small and medium-sized firms (SMEs)

Procedia PDF Downloads 50
28109 Cloud Design for Storing Large Amount of Data

Authors: M. Strémy, P. Závacký, P. Cuninka, M. Juhás

Abstract:

The main goal of this paper is to introduce our design of a private cloud for storing large amounts of data, especially pictures, and to provide a good technological backend for data analysis based on parallel processing and business intelligence. We have tested hypervisors, cloud management tools, storage for storing all data, and Hadoop to provide data analysis on unstructured data. High availability, virtual network management, logical separation of projects, and rapid deployment of physical servers to our environment were also needed.

Keywords: cloud, glusterfs, hadoop, juju, kvm, maas, openstack, virtualization

Procedia PDF Downloads 322
28108 Estimation of Missing Values in Aggregate Level Spatial Data

Authors: Amitha Puranik, V. S. Binu, Seena Biju

Abstract:

Missing data is a common problem in spatial analysis, especially at the aggregate level. At a given location, values can be missing in the covariates, in the response variable, or in both. Many techniques are available to estimate missing data values, but not all of them can be applied to spatial data, since the data are autocorrelated. Hence there is a need for a method that estimates the missing values in both the response variable and the covariates in spatial data by taking the spatial autocorrelation into account. The present study aims to develop a model to estimate the missing data points at the aggregate level in spatial data by accounting for (a) spatial autocorrelation of the response variable, (b) spatial autocorrelation of covariates, and (c) correlation between covariates and the response variable. Estimating the missing values of spatial data requires a model that explicitly accounts for the spatial autocorrelation. The proposed model not only accounts for spatial autocorrelation but also utilizes the correlation that exists between covariates, within covariates, and between the response variable and covariates. Precise estimation of the missing data points in spatial data will increase the precision of the estimated effects of independent variables on the response variable in spatial regression analysis.
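
The simplest way to see why spatial autocorrelation helps is a neighbour-mean baseline: a missing regional value is filled with the average of the observed values in adjacent regions. The sketch below shows only that baseline, which is far simpler than the model the study proposes and ignores covariates entirely; the function name, region labels, and adjacency map are assumptions made for illustration.

```python
def impute_from_neighbours(values, neighbours):
    """Fill missing regional values with the mean of observed neighbours.

    values:     dict mapping region -> value, or None if missing
    neighbours: dict mapping region -> list of adjacent regions
    Regions with no observed neighbour are left as None.
    """
    imputed = dict(values)
    for region, value in values.items():
        if value is None:
            observed = [
                values[n]
                for n in neighbours.get(region, [])
                if values.get(n) is not None
            ]
            if observed:
                imputed[region] = sum(observed) / len(observed)
    return imputed


# Hypothetical aggregate-level data: region B is missing.
filled = impute_from_neighbours(
    values={"A": 10.0, "B": None, "C": 20.0},
    neighbours={"B": ["A", "C"]},
)
```

Only originally observed values are used as donors, so the result does not depend on the order in which regions are processed.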

Keywords: spatial regression, missing data estimation, spatial autocorrelation, simulation analysis

Procedia PDF Downloads 341
28107 Establishment of Landslide Warning System Using Surface or Sub-Surface Sensors Data

Authors: Neetu Tyagi, Sumit Sharma

Abstract:

The study illustrates the results of an integrated study of the Tangni landslide, located on NH-58 at Chamoli, Uttarakhand. Geological, geo-morphological and geotechnical investigations were carried out to understand the mechanism of the landslide and to plan further investigation and monitoring. The movements were favored by continuous infiltration of rainwater from the zones where the phyllites/slates and dolomites outcrop. The site investigations, including monitoring of landslide movements and of water level fluctuations due to rainfall, give a better understanding of the landslide dynamics that have caused soil instability over time at the Tangni landslide site. The Early Warning System (EWS) comprises different types of sensors, all directly connected to a data logger, with raw data transferred to the Defence Terrain Research Laboratory (DTRL) server room via File Transfer Protocol (FTP). A geophysical survey found the slip surfaces at depths ranging from 8 to 10 m, and hence sensors were installed to a depth of 15 m at various locations on the landslide. Rainfall is the main triggering factor of the landslide. In this study, a model of unsaturated soil slope stability is developed. The analysis of one year of sensor data indicated the sliding surface at a depth between 6 and 12 m, with total displacement of up to 6 cm per year recorded in the body of the landslide. The aim of this study is to set the threshold and generate early warnings. Local people can be alerted to landslides if they have any type of warning system.

Keywords: early warning system, file transfer protocol, geo-morphological, geotechnical, landslide

Procedia PDF Downloads 124
28106 Environmental Impacts and Ecological Utilization of Water Hyacinth (Eichhornia crassipes) in the Niger Delta Fresh Ecosystem

Authors: Seiyaboh E. I.

Abstract:

Water Hyacinth (Eichhornia crassipes) was introduced into many parts of the world, including Africa, as an ornamental garden pond plant because of its beauty. However, it is considered a dangerous pest today because when not controlled, water hyacinth will cover rivers, lakes and ponds entirely; this dramatically impacts water flow, blocks sunlight from reaching native aquatic plants, and starves the water of oxygen, often killing fish and other aquatic organisms. In the Niger Delta region, water hyacinth is considered a nuisance because of its very obvious devastating environmental impacts in the region. However, water hyacinth (Eichhornia crassipes) constitutes a very important part of an aquatic ecosystem. It possesses specialized growth habits, physiological characteristics and reproductive strategies that allow for rapid growth and spread in freshwater environments and this explains its very rapid spread in the Niger Delta freshwater ecosystem. This paper therefore focuses on the environmental consequences of the proliferation of water hyacinth (Eichhornia crassipes) in the Niger Delta freshwater ecosystem, extent of impact, and options available for its ecological utilization which will help mitigate proliferation, restore effective freshwater ecosystem utilization and balance. It concludes by recommending sustainable practices outlining the beneficial uses of water hyacinth (Eichhornia crassipes) rather than control.

Keywords: environmental impacts, ecological utilization, Niger Delta, water hyacinth, Eichhornia crassipes

Procedia PDF Downloads 237
28105 Valuing Cultural Ecosystem Services of Natural Treatment Systems Using Crowdsourced Data

Authors: Andrea Ghermandi

Abstract:

Natural treatment systems such as constructed wetlands and waste stabilization ponds are increasingly used to treat water and wastewater from a variety of sources, including stormwater and polluted surface water. The provision of ancillary benefits in the form of cultural ecosystem services makes these systems unique among water and wastewater treatment technologies and greatly contributes to determine their potential role in promoting sustainable water management practices. A quantitative analysis of these benefits, however, has been lacking in the literature. Here, a critical assessment of the recreational and educational benefits in natural treatment systems is provided, which combines observed public use from a survey of managers and operators with estimated public use as obtained using geotagged photos from social media as a proxy for visitation rates. Geographic Information Systems (GIS) are used to characterize the spatial boundaries of 273 natural treatment systems worldwide. Such boundaries are used as input for the Application Program Interfaces (APIs) of two popular photo-sharing websites (Flickr and Panoramio) in order to derive the number of photo-user-days, i.e., the number of yearly visits by individual photo users in each site. The adequateness and predictive power of four univariate calibration models using the crowdsourced data as a proxy for visitation are evaluated. A high correlation is found between photo-user-days and observed annual visitors (Pearson's r = 0.811; p-value < 0.001; N = 62). Standardized Major Axis (SMA) regression is found to outperform Ordinary Least Squares regression and count data models in terms of predictive power insofar as standard verification statistics – such as the root mean square error of prediction (RMSEP), the mean absolute error of prediction (MAEP), the reduction of error (RE), and the coefficient of efficiency (CE) – are concerned. 
The SMA regression model is used to estimate the intensity of public use in all 273 natural treatment systems. System type, influent water quality, and area are found to statistically affect public use, consistently with a priori expectations. Publicly available information regarding the home location of the sampled visitors is derived from their social media profiles and used to infer the distance they are willing to travel to visit the natural treatment systems in the database. Such information is analyzed using the travel cost method to derive monetary estimates of the recreational benefits of the investigated natural treatment systems. Overall, the findings confirm the opportunities arising from an integrated design and management of natural treatment systems, which combines the objectives of water quality enhancement and provision of cultural ecosystem services through public use in a multi-functional approach and compatibly with the need to protect public health.
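The calibration step described in this abstract can be illustrated with a minimal sketch of Standardized Major Axis regression, in which the slope is the ratio of the standard deviations of the two variables, signed by the Pearson correlation. The function names and the sample numbers below are hypothetical and are not taken from the study; whether the authors transformed the data before fitting is not stated in the abstract.

```python
import math
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def sma_fit(x, y):
    """Standardized Major Axis fit: slope = sign(r) * sd(y) / sd(x)."""
    r = pearson_r(x, y)
    slope = math.copysign(stdev(y) / stdev(x), r)
    intercept = mean(y) - slope * mean(x)
    return slope, intercept, r


def rmsep(observed, predicted):
    """Root mean square error of prediction."""
    return math.sqrt(mean((o - p) ** 2 for o, p in zip(observed, predicted)))


# Hypothetical calibration data: photo-user-days vs. observed annual visitors.
photo_user_days = [3, 10, 25, 40, 80]
annual_visitors = [500, 2100, 4800, 9000, 15500]

slope, intercept, r = sma_fit(photo_user_days, annual_visitors)
predicted = [intercept + slope * p for p in photo_user_days]
print(f"r = {r:.3f}, slope = {slope:.1f}, RMSEP = {rmsep(annual_visitors, predicted):.1f}")
```

Unlike Ordinary Least Squares, the SMA slope does not shrink toward zero as the correlation weakens, which is one reason it is often considered when both variables are measured with error.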

Keywords: constructed wetlands, cultural ecosystem services, ecological engineering, waste stabilization ponds

Procedia PDF Downloads 149
28104 Association Rules Mining and NOSQL Oriented Document in Big Data

Authors: Sarra Senhadji, Imene Benzeguimi, Zohra Yagoub

Abstract:

Big Data represents the recent technology of manipulating voluminous and unstructured data sets over multiple sources. NoSQL has therefore emerged to handle the problem of unstructured data. Association rules mining is one of the popular data mining techniques for extracting hidden relationships from transactional databases. The problem of finding association dependencies can be solved well with MapReduce. The goal of our work is to reduce the time needed to generate frequent itemsets by using MapReduce and a document-oriented NoSQL database. A comparative study is given to evaluate the performance of our algorithm against the classical Apriori algorithm.
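As a rough illustration of how frequent-itemset counting maps onto the MapReduce model, the sketch below runs level-wise support counting in plain Python: a map phase emits (itemset, 1) pairs per transaction and a reduce phase sums them. This is not the authors' algorithm. The function names, the sample transactions and the simplified pruning (keeping only items that occur in some frequent itemset of the previous level, rather than full Apriori candidate generation) are assumptions made for the example, and no Hadoop or MongoDB calls are involved.

```python
from collections import Counter
from itertools import combinations


def map_phase(transaction, k, allowed_items):
    """Emit (itemset, 1) for every k-item combination of still-allowed items."""
    items = sorted(set(transaction) & allowed_items)
    return [(combo, 1) for combo in combinations(items, k)]


def reduce_phase(pairs):
    """Sum the emitted counts per itemset key."""
    counts = Counter()
    for key, value in pairs:
        counts[key] += value
    return counts


def frequent_itemsets(transactions, min_support, max_k=3):
    """Level-wise search: one map/reduce round per itemset size k."""
    allowed = {item for t in transactions for item in t}
    result = {}
    for k in range(1, max_k + 1):
        pairs = [p for t in transactions for p in map_phase(t, k, allowed)]
        counts = reduce_phase(pairs)
        frequent = {s: c for s, c in counts.items() if c >= min_support}
        if not frequent:
            break
        result.update(frequent)
        # Simplified pruning (assumption): drop items absent from every frequent k-itemset.
        allowed = {item for s in frequent for item in s}
    return result


# Hypothetical transactional data.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
for itemset, support in sorted(frequent_itemsets(baskets, min_support=2).items()):
    print(itemset, support)
```

On a real cluster, the map calls would run in parallel over partitions of the transaction store, and the reduce step would receive the pairs grouped by key.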

Keywords: Apriori, Association rules mining, Big Data, Data Mining, Hadoop, MapReduce, MongoDB, NoSQL

Procedia PDF Downloads 123
28103 Multimodal Deep Learning for Human Activity Recognition

Authors: Ons Slimene, Aroua Taamallah, Maha Khemaja

Abstract:

In recent years, human activity recognition (HAR) has been a key area of research due to its diverse applications. It has garnered increasing attention in the field of computer vision. HAR plays an important role in people’s daily lives as it has the ability to learn advanced knowledge about human activities from data. In HAR, activities are usually represented by exploiting different types of sensors, such as embedded sensors or visual sensors. However, these sensors have limitations, such as local obstacles, image-related obstacles, sensor unreliability, and consumer concerns. Recently, several deep learning-based approaches have been proposed for HAR, and these approaches are classified into two categories based on the type of data used: vision-based approaches and sensor-based approaches. This research paper highlights the importance of multimodal data fusion, combining skeleton data obtained from videos with data generated by embedded sensors, using deep neural networks to achieve HAR. We propose a deep multimodal fusion network based on a two-stream architecture. The two streams use a Convolutional Neural Network combined with a Bidirectional LSTM (CNN-BiLSTM) to process the skeleton data and the data generated by embedded sensors, and fusion is performed at the feature level. The proposed model was evaluated on the public OPPORTUNITY++ dataset and produced an accuracy of 96.77%.

Keywords: human activity recognition, action recognition, sensors, vision, human-centric sensing, deep learning, context-awareness

Procedia PDF Downloads 59
28102 Immunization-Data-Quality in Public Health Facilities in the Pastoralist Communities: A Comparative Study Evidence from Afar and Somali Regional States, Ethiopia

Authors: Melaku Tsehay

Abstract:

The Consortium of Christian Relief and Development Associations (CCRDA) and the CORE Group Polio Partners (CGPP) Secretariat have been working with the Global Alliance for Vaccines and Immunization (GAVI) to improve immunization data quality in Afar and Somali Regional States. The main aim of this study was to compare the quality of immunization data before and after these interventions in health facilities in the pastoralist communities in Ethiopia. To this end, a comparative cross-sectional study was conducted on 51 health facilities. The baseline data was collected in May 2019, and the end line data in August 2021. The WHO data quality self-assessment tool (DQS) was used to collect data. A significant improvement was seen in the accuracy of the pentavalent vaccine (PT)1 data (p = 0.012) at the health posts (HP), and in the PT3 (p = 0.010) and measles (p = 0.020) data at the health centers (HC). In addition, a highly significant improvement was observed in the accuracy of tetanus toxoid (TT)2 data at HP (p < 0.001). The level of over- or under-reporting for PT3 was found to be < 8% at the HP and < 10% at the HC. Data completeness also increased from 72.09% to 88.89% at the HC. Nearly 74% of the health facilities reported their respective immunization data on time, which is much better than the baseline (7.1%) (p < 0.001). These findings may provide some hints for policies and programs aimed at improving immunization data quality in the pastoralist communities.

Keywords: data quality, immunization, verification factor, pastoralist region

Procedia PDF Downloads 60
28101 Ecosystem Services and Excess Water Management: Analysis of Ecosystem Services in Areas Exposed to Excess Water Inundation

Authors: Dalma Varga, Nora Hubayne H.

Abstract:

Nowadays, among the measures taken to offset the consequences of climate change, water resources management is one of the key tools, which can include excess water management. As a result of the effects of climate change and of frequent inappropriate land use, more and more areas are affected by excess water inundation. Hungary is located in the deepest part of the Pannonian Basin, which is exposed to water damage – especially lowland areas that are endangered by floods or excess waters. The periodical presence of excess water creates specific habitats in a given area, which have ecological, functional, and aesthetic values. Excess water inundation affects approximately 74% of Hungary’s lowland areas, of which about 46% is also under nature protection (such as national parks, protected landscape areas, nature conservation areas, Natura 2000 sites, etc.). These data prove that areas exposed to excess water inundation – which are predominantly characterized by agricultural land uses – have an important ecological role. Other research works have confirmed the presence of numerous rare and endangered plant species in drainage canals, on grasslands exposed to excess water, and on special agricultural fields with mud vegetation. The goal of this research is to define and analyze ecosystem services of areas exposed to excess water inundation. In addition, it is also important to determine quantified indicators of these areas’ natural and landscape values, besides the presence of protected species and the naturalness of habitats; in short, to analyze the various nature protections related to excess water. As a result, a practice-orientated assessment method has been developed that provides the ecological water demand, assimilates to ecological and habitat aspects, contributes to adaptive excess water management, and, last but not least, increases or maintains the share of the green infrastructure network. In this way, it also contributes to reducing and mitigating the negative effects of climate change.

Keywords: ecosystem services, landscape architecture, excess water management, green infrastructure planning

Procedia PDF Downloads 280
28100 Identifying Critical Success Factors for Data Quality Management through a Delphi Study

Authors: Maria Paula Santos, Ana Lucas

Abstract:

Organizations base their operations and decision making on the data they have at their disposal, so the quality of these data is remarkably important. Data Quality (DQ) is currently a relevant issue, and the literature is unanimous in pointing out that poor DQ can result in large costs for organizations. The literature review identified and described 24 Critical Success Factors (CSF) for Data Quality Management (DQM). These were presented to a panel of experts, who ordered them according to their degree of importance, using the Delphi method with the Q-sort technique, based on an online questionnaire. The study shows that the five most important CSF for DQM are: definition of appropriate policies and standards, control of inputs, definition of a strategic plan for DQ, organizational culture focused on quality of the data, and obtaining top management commitment and support.

Keywords: critical success factors, data quality, data quality management, Delphi, Q-Sort

Procedia PDF Downloads 182
28099 Development and Investigation of Sustainable Wireless Sensor Networks for Forest Ecosystems

Authors: Shathya Duobiene, Gediminas Račiukaitis

Abstract:

Solar-powered wireless sensor nodes work best when they operate continuously with minimal energy consumption. Wireless Sensor Networks (WSNs) are a new technology that opens up wide areas of study, and advancements are expanding the prevalence of numerous monitoring applications and real-time aid for environments. The Selective Surface Activation Induced by Laser (SSAIL) technology is an exciting development that gives the design of WSNs more flexibility in terms of their shape, dimensions, and materials. This research work proposes a methodology for using SSAIL technology for forest ecosystem monitoring by wireless sensor networks. WSNs monitoring temperature and humidity were deployed, and their architectures are discussed. The paper presents the experimental outcomes of deploying newly built sensor nodes in forested areas. Finally, a practical method is offered to extend the WSN's lifespan and ensure its continued operation. When operational, the node is independent of the base station's power supply and uses only as much energy as necessary to sense and transmit data.

Keywords: internet of things (IoT), wireless sensor network, sensor nodes, SSAIL technology, forest ecosystem

Procedia PDF Downloads 41
28098 Scientific Linux Cluster for BIG-DATA Analysis (SLBD): A Case of Fayoum University

Authors: Hassan S. Hussein, Rania A. Abul Seoud, Amr M. Refaat

Abstract:

Scientific researchers face the analysis of very large data sets, which are growing at a noticeable rate with today’s and tomorrow’s technologies. Hadoop and Spark are software frameworks developed for such analysis. The Hadoop framework is suitable for many different hardware platforms. In this research, a scientific Linux cluster for Big Data analysis (SLBD) is presented. SLBD runs open source software on a high performance cluster infrastructure with large computational capacity. SLBD is composed of one cluster containing identical, commodity-grade computers interconnected via a small LAN. It consists of a fast switch and Gigabit Ethernet cards which connect four nodes. Cloudera Manager is used to configure and manage an Apache Hadoop stack. Hadoop is a framework that allows storing and processing big data across the cluster by using the MapReduce algorithm. The MapReduce algorithm divides the task into smaller tasks, which are assigned to the network nodes. The algorithm then collects the results and forms the final result dataset. The SLBD clustering system allows fast and efficient processing of large amounts of data resulting from different applications. SLBD also provides high performance, high throughput, high availability, expandability and cluster scalability.
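The divide, assign and collect pattern described in this abstract can be sketched in a few lines of plain Python. The example below is only a single-machine illustration with hypothetical function names: it splits the input into four partitions (mirroring the four nodes mentioned), applies a map task to each, groups the intermediate pairs by key and reduces them to a word count. It does not use Hadoop or Cloudera APIs, and the map tasks run sequentially here, whereas on a cluster they would run on separate nodes.

```python
from collections import defaultdict


def split_into_tasks(records, n_nodes):
    """Divide the input into one partition per node (round-robin)."""
    return [records[i::n_nodes] for i in range(n_nodes)]


def map_task(records):
    """Map step for one partition: emit (word, 1) for every word."""
    return [(word.lower(), 1) for line in records for word in line.split()]


def shuffle(mapped_outputs):
    """Group intermediate values by key across all partitions."""
    grouped = defaultdict(list)
    for output in mapped_outputs:
        for key, value in output:
            grouped[key].append(value)
    return grouped


def reduce_task(grouped):
    """Reduce step: combine the values of each key into the final result."""
    return {key: sum(values) for key, values in grouped.items()}


def run_job(records, n_nodes=4):
    """Run the whole job; map tasks are executed one after another in this sketch."""
    tasks = split_into_tasks(records, n_nodes)
    mapped = [map_task(task) for task in tasks]
    return reduce_task(shuffle(mapped))


# Hypothetical input lines.
lines = ["big data analysis", "Big cluster", "data data"]
print(run_job(lines))
```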

Keywords: big data platforms, cloudera manager, Hadoop, MapReduce

Procedia PDF Downloads 325
28097 The Effect of Multiple Environmental Conditions on Acacia senegal Seedling’s Carbon, Nitrogen, and Hydrogen Contents: An Experimental Investigation

Authors: Abdelmoniem A. Attaelmanan, Ahmed A. H. Siddig

Abstract:

This study was conducted in light of continual global climate changes that are projected to increase aridity and cause changes in soil fertility and pollution. Plant growth and development largely depend on the combination of available water and nutrients in the soil. Changes in the climate and atmospheric chemistry can have serious effects on these growth factors. Plant carbon (C), nitrogen (N), and hydrogen (H) play a fundamental role in the maintenance of ecosystem structure and function. Hashab (Acacia senegal), which produces gum Arabic, supports dryland ecosystems in tropical zones through its potential to restore degraded soils; hence it is ecologically and economically important for the dry areas of sub-Saharan Africa. The study aims at investigating the effects of water stress (simulated drought) and poor soil type on Acacia senegal C, N, and H contents. Seven-day-old seedlings were assigned to the treatments in a split-plot design for four weeks. The main plot is the irrigation interval (well-watered and water-stressed), and the subplot is the soil type (silt and sand soils). Seedling C%, N%, and H% were measured using a CHNS-O Analyzer, applying the Standard Test Method. Irrigation intervals and soil types had no effect on whole-seedling and leaf C%, N%, and H%. Irrigation interval affected stem C% and H%, both irrigation interval and soil type affected root N%, and an interaction effect of water and soil was found on leaf and root N%. In synthesis, applying well-watered irrigation to soil that is rich in N and other nutrients would result in the greatest seedling C, N, and H content, which will enhance growth and biomass accumulation and can play a crucial role in ecosystem productivity and services in the dryland regions.

Keywords: Acacia senegal, Africa, climate change, drylands, nutrients biomass, Sub-Saharan, Sudan

Procedia PDF Downloads 77