Search results for: big data ecosystem
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 25188

24498 Data Science-Based Key Factor Analysis and Risk Prediction of Diabetic

Authors: Fei Gao, Rodolfo C. Raga Jr.

Abstract:

This research proposal aims to ascertain the major risk factors for diabetes and to design a predictive model for risk assessment. The project aims to improve diabetes early detection and management by utilizing data science techniques, which may improve patient outcomes and healthcare efficiency. Using the Diabetes Health Indicators Dataset from Kaggle as the research data, the phase relation values of each attribute were used to analyze and select the attributes that might influence the examinee's survival probability. We compare and evaluate eight machine learning algorithms. Our investigation begins with comprehensive data preprocessing, including feature engineering and dimensionality reduction, aimed at enhancing data quality. The dataset, comprising health indicators and medical data, serves as a foundation for training and testing these algorithms. A rigorous cross-validation process is applied, and we assess performance using five key metrics: accuracy, precision, recall, F1-score, and area under the receiver operating characteristic curve (AUC-ROC). After analyzing the data characteristics, we investigate their impact on the likelihood of diabetes and develop corresponding risk indicators.
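
A minimal sketch of the evaluation workflow described above, assuming a generic tabular dataset and scikit-learn; the file name, target column, and the two candidate models shown are illustrative placeholders rather than the study's exact configuration.

```python
# Illustrative sketch: cross-validated comparison of classifiers on a
# diabetes-indicator style table (file and column names are hypothetical).
import pandas as pd
from sklearn.model_selection import cross_validate, StratifiedKFold
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier

df = pd.read_csv("diabetes_health_indicators.csv")            # hypothetical file name
X, y = df.drop(columns=["Diabetes_binary"]), df["Diabetes_binary"]

scoring = ["accuracy", "precision", "recall", "f1", "roc_auc"]   # the five metrics
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

models = {
    "logistic_regression": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=42),
}

for name, model in models.items():
    scores = cross_validate(model, X, y, cv=cv, scoring=scoring)
    print(name, {m: round(scores[f"test_{m}"].mean(), 3) for m in scoring})
```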

Keywords: diabetes, risk factors, predictive model, risk assessment, data science techniques, early detection, data analysis, Kaggle

Procedia PDF Downloads 58
24497 A Methodology to Integrate Data in the Company Based on the Semantic Standard in the Context of Industry 4.0

Authors: Chang Qin, Daham Mustafa, Abderrahmane Khiat, Pierre Bienert, Paulo Zanini

Abstract:

Nowadays, companies face many challenges in the process of digital transformation, which can be a complex and costly undertaking. Digital transformation involves the collection and analysis of large amounts of data, which can create challenges around data management and governance. Furthermore, integrating data from multiple systems and technologies is an additional challenge. Despite these pains, companies are still pursuing digitalization because, by embracing advanced technologies, they can improve efficiency, quality, decision-making, and customer experience while also creating different business models and revenue streams. This paper focuses on the issue that data is stored in data silos with different schemas and structures. The conventional approaches to addressing this issue involve utilizing data warehousing, data integration tools, data standardization, and business intelligence tools. However, these approaches primarily focus on the grammar and structure of the data and neglect the importance of semantic modeling and semantic standardization, which are essential for achieving data interoperability. In this work, the challenge of data silos in Industry 4.0 is addressed by developing a semantic modeling approach compliant with Asset Administration Shell (AAS) models as an efficient standard for communication in Industry 4.0. The paper highlights how our approach can facilitate the data mapping process and semantic lifting according to existing industry standards such as ECLASS and other industrial dictionaries. It also incorporates Asset Administration Shell technology to model and map the company’s data and utilizes a knowledge graph for data storage and exploration.
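
As a rough illustration of the semantic lifting idea (not the authors' AAS implementation), the sketch below maps raw field names from two hypothetical silos onto shared dictionary identifiers and emits subject-predicate-object triples that could be loaded into a knowledge graph; all identifiers and field names are invented for illustration.

```python
# Toy semantic lifting: map silo-specific field names to shared semantic IDs
# (loosely in the spirit of ECLASS-style dictionaries) and emit triples.
FIELD_TO_CONCEPT = {                      # hypothetical mapping table
    "erp.part_no":     "dict:PartNumber",
    "mes.material_id": "dict:PartNumber",   # same concept, different silo
    "erp.weight_kg":   "dict:NetWeight",
}

def lift(silo: str, record: dict) -> list[tuple[str, str, str]]:
    """Turn one raw record into (subject, predicate, object) triples."""
    subject = f"asset:{silo}/{record['id']}"
    triples = []
    for field, value in record.items():
        concept = FIELD_TO_CONCEPT.get(f"{silo}.{field}")
        if concept:                          # keep only semantically mapped fields
            triples.append((subject, concept, str(value)))
    return triples

erp_record = {"id": "123", "part_no": "A-77", "weight_kg": 1.2}
for triple in lift("erp", erp_record):
    print(triple)
```

Because both silos map onto the same dictionary concept, records from different systems become comparable once lifted, which is the point of semantic (rather than purely structural) integration.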

Keywords: data interoperability in industry 4.0, digital integration, industrial dictionary, semantic modeling

Procedia PDF Downloads 81
24496 Big Data Analytics and Data Security in the Cloud via Fully Homomorphic Encryption

Authors: Waziri Victor Onomza, John K. Alhassan, Idris Ismaila, Noel Dogonyaro Moses

Abstract:

This paper describes the problem of building secure computational services for encrypted information in cloud computing without decrypting the encrypted data; it thereby addresses the aspiration for a computational encryption model that could enhance the security of big data with respect to users' privacy, confidentiality, and availability. The cryptographic model applied for the computational processing of the encrypted data is the Fully Homomorphic Encryption Scheme. We contribute theoretical presentations of high-level computational processes, based on number theory and algebra, that can easily be integrated and leveraged in cloud computing, together with detailed theoretical mathematical concepts for fully homomorphic encryption models. This contribution enhances the full implementation of a big data analytics-based cryptographic security algorithm.
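
The homomorphic property at the heart of such schemes can be illustrated with a toy example: unpadded ("textbook") RSA is multiplicatively homomorphic, so the product of two ciphertexts decrypts to the product of the plaintexts. This is only a didactic sketch of the algebraic idea, not the fully homomorphic scheme the paper discusses, and textbook RSA with tiny parameters is insecure by design.

```python
# Didactic demo of a homomorphic property: textbook RSA multiplies under encryption.
p, q, e = 61, 53, 17                  # tiny toy parameters, insecure on purpose
n = p * q
d = pow(e, -1, (p - 1) * (q - 1))     # private exponent

enc = lambda m: pow(m, e, n)
dec = lambda c: pow(c, d, n)

m1, m2 = 12, 7
c_product = (enc(m1) * enc(m2)) % n   # multiply ciphertexts without ever decrypting
assert dec(c_product) == (m1 * m2) % n
print("Dec(E(m1)*E(m2)) =", dec(c_product), "= m1*m2 =", m1 * m2)
```

A fully homomorphic scheme extends this so that both addition and multiplication (and hence arbitrary circuits) can be evaluated on ciphertexts, typically with bootstrapping to control noise growth.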

Keywords: big data analytics, security, privacy, bootstrapping, homomorphic, homomorphic encryption scheme

Procedia PDF Downloads 368
24495 Protecting Privacy and Data Security in Online Business

Authors: Bilquis Ferdousi

Abstract:

With the exponential growth of online business, the threat to consumers’ privacy and data security has become a serious challenge. This literature review-based study focuses on a better understanding of those threats and of the legislative measures that have been taken to address them. Research shows that people are increasingly involved in online business using different digital devices and platforms, although this practice varies across age groups. The threat to consumers’ privacy and data security is a serious hindrance to developing trust among consumers in online businesses. Some legislative measures have been taken at the federal and state levels to protect consumers’ privacy and data security. The study was based on an extensive review of current literature on protecting consumers’ privacy and data security and on the legislative measures that have been taken.

Keywords: privacy, data security, legislation, online business

Procedia PDF Downloads 91
24494 Flowing Online Vehicle GPS Data Clustering Using a New Parallel K-Means Algorithm

Authors: Orhun Vural, Oguz Bayat, Rustu Akay, Osman N. Ucan

Abstract:

This study presents a new parallel approach to clustering GPS data. Evaluation was made by comparing the execution times of various clustering algorithms on GPS data. This paper proposes a neighborhood-based parallel K-means algorithm to make clustering faster. The proposed parallelization approach assumes that each GPS data point represents a vehicle, and that vehicles close to each other communicate after they are clustered. This parallelization approach has been examined on continuously changing GPS data of different sizes and compared with the serial K-means algorithm and other serial clustering algorithms. The results demonstrated that the proposed parallel K-means algorithm works much faster than the other clustering algorithms.
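
A minimal sketch of parallelizing the assignment step of K-means over chunks of GPS points with Python's multiprocessing; it is not the authors' neighborhood-based algorithm, just the basic map-reduce structure such a parallelization can take (the coordinate ranges are invented).

```python
# Sketch: parallel K-means assignment step over GPS points (lat, lon).
import numpy as np
from multiprocessing import Pool

def assign_chunk(args):
    points, centroids = args
    d = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    return np.argmin(d, axis=1)                       # nearest centroid per point

def parallel_kmeans(points, k=5, iters=10, workers=4):
    rng = np.random.default_rng(0)
    centroids = points[rng.choice(len(points), k, replace=False)]
    for _ in range(iters):
        chunks = np.array_split(points, workers)
        with Pool(workers) as pool:
            labels = np.concatenate(pool.map(assign_chunk, [(c, centroids) for c in chunks]))
        # update step: mean of the points assigned to each cluster (keep old centroid if empty)
        centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
    return labels, centroids

if __name__ == "__main__":
    gps = np.random.default_rng(1).uniform([40.9, 28.6], [41.2, 29.3], size=(10000, 2))
    labels, centroids = parallel_kmeans(gps)
    print(centroids)
```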

Keywords: parallel k-means algorithm, parallel clustering, clustering algorithms, clustering on flowing data

Procedia PDF Downloads 210
24493 Understanding Governance of Biodiversity-Supporting and Edible Landscapes Using Network Analysis in a Fast Urbanising City of South India

Authors: M. Soubadra Devy, Savitha Swamy, Chethana V. Casiker

Abstract:

Sustainable smart cities are emerging as an important concept in response to the exponential rise in the world’s urbanizing population. While earlier only technical, economic, and governance-based solutions were considered, more and more layers have been added in recent times. With the prefix of 'sustainability', solutions which help in the judicious use of resources without negatively impacting the environment have become critical. We present a case study of Bangalore city, which has transformed from being a garden city and pensioners' paradise to being an IT city with a huge, young population from different regions and diverse cultural backgrounds. This has had a big impact on the green spaces in the city and the biodiversity that they support, as well as on farming/gardening practices. Edible landscapes comprising farmlands, home gardens, and neighbourhood parks (NPs henceforth) were examined. The land prices of areas having NPs were higher than those of areas that did not, indicating an appreciation of their aesthetic value. NPs were part of old and new residential areas, largely managed by the municipality. They comprised manicured gardens which were similar in vegetation structure and composition. Results showed that NPs that occurred at higher density supported reasonable levels of biodiversity. In situations where NPs occurred at lower density, the presence of a larger green space such as a heritage park or botanical garden enhanced the biodiversity of these parks. In contrast, farmlands and home gardens, which were common within the city, are being lost at an unprecedented scale to developmental projects. However, there is also the emergence of a 'neo-culture' of home gardening that promotes 'locovory', or consumption of locally grown food, as a means to sustainable living and a reduced carbon footprint. This movement overcomes the space constraint by using vertical and terrace gardening techniques. Food that is grown within cities comprises vegetables and fruits which are largely pollinator dependent. This goes hand in hand with our landscape-level study that has shown that cities support pollinator diversity. Maintaining and improving these man-made ecosystems requires analysing the functioning and characteristics of the existing structures of governance. A social network analysis tool was applied to NPs to examine the relationships between actors and their ties. The management structures around NPs, gaps, and means to strengthen the networks from the current state to a near-ideal state were identified for enhanced services. Learnings from NPs were used to build a hypothetical structure for the governance and functioning of integrated management of NPs and edible landscapes, to enhance ecosystem services such as biodiversity support, food production, and aesthetic value. They also contribute to the sustainability axis of smart cities.
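
A small sketch of the kind of actor network analysis mentioned above, using networkx on an invented set of governance actors and ties; the actors and relationships are purely illustrative, not the study's data.

```python
# Illustrative governance network: actors as nodes, working ties as edges.
import networkx as nx

G = nx.Graph()
ties = [
    ("Municipality", "Park contractor"),
    ("Municipality", "Residents' association"),
    ("Residents' association", "Gardeners"),
    ("NGO", "Residents' association"),
    ("NGO", "Municipality"),
]
G.add_edges_from(ties)

# Centrality scores help identify which actors hold the network together
# and where ties could be strengthened.
degree = nx.degree_centrality(G)
betweenness = nx.betweenness_centrality(G)
for actor in G.nodes:
    print(f"{actor:25s} degree={degree[actor]:.2f} betweenness={betweenness[actor]:.2f}")
```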

Keywords: biodiversity support, ecosystem services, edible green spaces, neighbourhood parks, sustainable smart city

Procedia PDF Downloads 130
24492 Bacterial Community Diversity in Soil under Two Tillage Systems

Authors: Dalia Ambrazaitienė, Monika Vilkienė, Danute Karcauskienė, Gintaras Siaudinis

Abstract:

The soil is a complex ecosystem that is part of our biosphere. The ability of soil to provide ecosystem services is dependent on microbial diversity. Tillage is one of the major factors that affect soil properties. No-till systems or shallow ploughless tillage are the opposite of traditional deep ploughing: no-tillage systems, for instance, increase soil organic matter by reducing mineralization rates and stimulating litter concentrations in the top soil layer, whereas deep ploughing increases the biological activity of the arable soil layer and reduces the incidence of weeds. The role of soil organisms is central to soil processes. Although the number of microbial species in soil is still being debated, the metagenomic approach to estimating microbial diversity has predicted about 2,000–18,000 bacterial genomes in 1 g of soil. Despite the key role of bacteria in soil processes, there is still a lack of information about the bacterial diversity of soils as affected by tillage practices. This study focused on metagenomic analysis of bacterial diversity in long-term experimental plots of Dystric Epihypogleyic Albeluvisols in the western part of Lithuania. The experiment was set up in 2013 and had a split-plot design in which the whole-plot treatments were laid out in a randomized design with three replicates. The whole-plot treatments consisted of two tillage methods: deep ploughing (22-25 cm) (DP) and ploughless tillage (7-10 cm) (PT). Three subsamples (0-20 cm) were collected on October 22, 2015 for each of the three replicates. Subsamples from the DP and PT systems were pooled to make two composite samples, one representing deep ploughing (DP) and the other ploughless tillage (PT). Genomic DNA was extracted from approximately 200 mg of field-moist soil per sample by using the D6005 Fungal/Bacterial Miniprep set (Zymo Research®) following the manufacturer’s instructions. To determine bacterial diversity and community composition, we employed a culture-independent approach of high-throughput pyrosequencing of the 16S rRNA gene. Metagenomic sequencing was performed on the Illumina MiSeq platform at the Base Clear Company. The microbial component of soil plays a crucial role in the cycling of nutrients in the biosphere. Our study was a preliminary attempt at observing bacterial diversity in soil under two common but contrasting tillage practices. The number of sequenced reads obtained for PT (161,917) was higher than for DP (131,194). The 10 most abundant genera in the soil samples were the same (Arthrobacter, Candidatus Saccharibacteria, Actinobacteria, Acidobacterium, Mycobacterium, Bacillus, Alphaproteobacteria, Longilinea, Gemmatimonas, Solirubrobacter); only their shares of the community differed. In DP, Arthrobacter and Acidobacterium constituted 8.4% and 2.5% of the whole community, respectively, while in PT they constituted just 5.8% and 2.1%. Nocardioides and Terrabacter were observed only in PT. This work was supported by the project VP1-3.1-ŠMM-01-V-03-001 NKPDOKT and the National Science Program: The effect of long-term, different-intensity management of resources on the soils of different genesis and on other components of the agro-ecosystems [grant number SIT-9/2015], funded by the Research Council of Lithuania.

Keywords: deep ploughing, metagenomics, ploughless tillage, soil community analysis

Procedia PDF Downloads 232
24491 Cognitive Science Based Scheduling in Grid Environment

Authors: N. D. Iswarya, M. A. Maluk Mohamed, N. Vijaya

Abstract:

A grid is an infrastructure that allows the deployment of large-scale distributed data from multiple locations to reach a common goal. Scheduling data-intensive applications becomes challenging because the data sets involved are very large. Only two solutions exist to tackle this challenging issue. First, the computation that requires huge data sets can be transferred to the data site. Second, the required data sets can be transferred to the computation site. In the former scenario, the computation cannot be transferred since the servers are storage/data servers with little or no computational capability. Hence, the second scenario can be considered for further exploration. During scheduling, transferring huge data sets from one site to another requires substantial network bandwidth. In order to mitigate this issue, this work focuses on incorporating cognitive science into scheduling. Cognitive science is the study of the human brain and its related activities. Current research mainly focuses on incorporating cognitive science into various computational modeling techniques. In this work, the problem-solving approach of the human brain is studied and incorporated into data-intensive scheduling in grid environments. Here, a cognitive engine (CE) is designed and deployed at various grid sites. The intelligent agents present in the CE help in analyzing requests and creating the knowledge base. Depending upon the link capacity, a decision is taken on whether to transfer the data sets or to partition them. The agents predict the next request so as to serve the requesting site with data sets in advance. This reduces the data availability time and the data transfer time. The replica catalog and metadata catalog created by the agents assist in the decision-making process.
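
A toy sketch of the transfer-versus-partition decision described above, driven only by link capacity and data set size; the threshold and the equal-split partitioning rule are invented for illustration and are not the paper's cognitive engine.

```python
# Toy decision rule: transfer the whole data set if the estimated transfer time
# over the available link is acceptable, otherwise partition it across sites.
def schedule(dataset_gb: float, link_gbps: float, deadline_s: float, sites: int) -> dict:
    transfer_time = dataset_gb * 8 / link_gbps        # seconds, ignoring protocol overheads
    if transfer_time <= deadline_s:
        return {"action": "transfer", "est_time_s": transfer_time}
    # otherwise split the data set into equal partitions for the available sites
    return {"action": "partition", "parts": sites, "est_time_s": transfer_time / sites}

print(schedule(dataset_gb=500, link_gbps=1.0, deadline_s=1800, sites=4))  # -> partition
print(schedule(dataset_gb=50,  link_gbps=1.0, deadline_s=1800, sites=4))  # -> transfer
```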

Keywords: data grid, grid workflow scheduling, cognitive artificial intelligence

Procedia PDF Downloads 382
24490 Heritage and Tourism in the Era of Big Data: Analysis of Chinese Cultural Tourism in Catalonia

Authors: Xinge Liao, Francesc Xavier Roige Ventura, Dolores Sanchez Aguilera

Abstract:

With the development of the Internet, the study of tourism behavior has rapidly expanded from the traditional physical market to the online market. Data on the Internet is characterized by dynamic change, and new data appear all the time. In recent years, a large volume of data has been generated from sources such as forums, blogs, and other platforms, which have expanded over time and space; together they constitute large-scale Internet data, known as Big Data. These data of technological origin, derived from the use of devices and the activity of multiple users, are becoming a source of great importance for the study of geography and the behavior of tourists. The study will focus on cultural heritage tourism practices in the context of Big Data. The research will explore the characteristics and behavior of Chinese tourists in relation to the cultural heritage of Catalonia. Geographical information, destination image, and perceptions in user-generated content will be studied through the analysis of data from Weibo, the largest blog-based social network in China. Through the analysis of the behavior of heritage tourists in the Big Data environment, this study will understand the practices (activities, motivations, perceptions) of cultural tourists and, in turn, the needs and preferences of tourists, in order to better guide the sustainable development of tourism at heritage sites.

Keywords: Barcelona, Big Data, Catalonia, cultural heritage, Chinese tourism market, tourists’ behavior

Procedia PDF Downloads 124
24489 Towards A Framework for Using Open Data for Accountability: A Case Study of A Program to Reduce Corruption

Authors: Darusalam, Jorish Hulstijn, Marijn Janssen

Abstract:

The media have revealed a variety of corruption cases in regional and local governments all over the world. Many governments have pursued anti-corruption reforms and have created systems of checks and balances. Citizens face three types of corruption: administrative corruption, collusion, and extortion. Accountability is one of the benchmarks for building transparent government. The public sector is required to report the results of the programs that have been implemented so that citizens can judge whether the institution has been working economically, efficiently, and effectively. Open Data offers solutions for the implementation of good governance in organizations that want to be more transparent. In addition, Open Data can create transparency and accountability towards the community. The objective of this paper is to build a framework of open data for accountability in combating corruption. This paper will investigate the relationship between open data and accountability as part of anti-corruption initiatives. The research will also investigate the impact of open data implementation on public organizations.

Keywords: open data, accountability, anti-corruption, framework

Procedia PDF Downloads 316
24488 Assessment of Trace Metal Concentration of Soils Contaminated with Carbide in Abraka, Delta State, Nigeria

Authors: O.M. Agbogidi, I.M. Onochie

Abstract:

An investigation was carried out in 2014 on the trace metal concentrations of soils contaminated with carbide in Abraka, Delta State, Nigeria, with a view to providing baseline information on their status relative to the control plots and to the tolerable limits recommended by world standard bodies, including the WHO and FAO. The metals were analyzed using an atomic absorption spectrophotometer, which showed elevated levels when compared with the control plots. High levels of metals including Fe, Pb, Zn, Cu, Cd, Ni, Cr, and As were recorded, and these values were significantly different (P<0.05) from the values obtained from the control plots. These results indicate that the carbide-polluted soil had higher levels of trace metals, and because these metals are non-biodegradable elements in the ecosystem, a rise to lethal levels in food chains is envisaged, owing to the interdependency of plants and animals stemming from soil-water-organism interrelationships.

Keywords: bio-concentration, carbide contaminated soils, heavy metals, trace metals

Procedia PDF Downloads 264
24487 Wastes of Oil Drilling: Treatment Techniques and Their Effectiveness

Authors: Abbas Hadj Abbas, Hacini Massaoud, Aiad Lahcen

Abstract:

In the Hassi-Messoud oil industry, water-based mud (WBM) systems are generally used for drilling the first phase of a well. For the rest of the well, oil-based mud (OBM) systems are employed. In the field of oil exploration, a panoply of chemical products is employed in the formulation of drilling fluids. These components, of different natures and with ill-defined toxicity and biodegradability parameters, are nevertheless discharged into nature. In addition to hydrocarbons (HC, such as diesel), which are a major constituent of oil-based mud, spills of a variety of other products and additives can also be observed on drilling sites. These wastes are usually stored in places known as 'crud wastes'. They may cause major problems to the ecosystem. To treat these wastes, we considered two methods: chemical solidification/stabilization and thermal treatment. So that the treatment techniques could be evaluated, a series of analyses was performed on dozens of waste specimens before treatment. After that, and on the basis of our analyses of the wastes, we opted for diagnostics of the pollution before and after solidification and stabilization. Finally, analyses were performed before and after the thermal treatment to check the efficiency of the methods followed in the study.

Keywords: waste treatment, oil pollution, norms, drilling wastes

Procedia PDF Downloads 278
24486 Oil Contaminate Removal from Wastewater with Novel Nanofiber-Based Membranes

Authors: Zhaoyang Liu

Abstract:

Oil pollution is typically caused by oil- and gas-related operations such as vessel accidents, which can pollute waterways as well as the environment and damage the ecosystem. Tanker ship cleaning contributes to oil spills, which have a negative impact on coastal countries due to protracted service disruption. It is critical for coastal countries to develop efficient oil contaminant cleanup technology. There are various oil/water separation technologies, such as gravity separation, hydrocyclones, air flotation, and membrane filtration, among others. Among these, membrane filtration has been shown to produce high-quality effluent. Commercial membranes, however, still face significant practical challenges, such as a high susceptibility to membrane fouling when dealing with oily effluent. In this work, we developed a unique anti-fouling filtration membrane for oil/water separation. The membrane was made of inorganic nanofibers, which possess the advantages of low membrane fouling, high permeation flux, and long-term durability. The results from this study could help pave a new way for the practical application of membrane filtration in the oil and gas industry.

Keywords: oil, contaminate, wastewater, removal

Procedia PDF Downloads 61
24485 Syndromic Surveillance Framework Using Tweets Data Analytics

Authors: David Ming Liu, Benjamin Hirsch, Bashir Aden

Abstract:

Syndromic surveillance aims to detect or predict disease outbreaks through the analysis of medical sources of data. Using social media data such as tweets for syndromic surveillance is becoming more and more popular, aided by open platforms for data collection and the advantages of microblogging text and mobile geolocation features. In this paper, a syndromic surveillance framework with a machine learning kernel using tweet data analytics is presented. Influenza and the three United Arab Emirates cities of Abu Dhabi, Al Ain, and Dubai are used as the test disease and trial areas. Hospital case data provided by the Health Authority of Abu Dhabi (HAAD) are used for correlation purposes. In our model, a Latent Dirichlet Allocation (LDA) engine is adapted to perform supervised classification, and N-fold cross-validation confusion matrices are given as the simulation results, with an overall system recall of 85.595% achieved.
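
A compact sketch of the kind of pipeline described above: bag-of-words features, an LDA topic engine, and a downstream classifier evaluated with N-fold cross-validation. The toy tweets, labels, and the choice of logistic regression as the final classifier are illustrative assumptions, not the authors' exact setup.

```python
# Sketch: LDA topic features from tweets feeding a supervised classifier.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_predict
from sklearn.metrics import confusion_matrix, recall_score

tweets = ["feeling feverish and coughing all day", "great brunch in dubai today",
          "flu season again, whole office is sick", "traffic on the way to abu dhabi"] * 25
labels = [1, 0, 1, 0] * 25            # 1 = influenza-related, 0 = not (toy labels)

pipeline = make_pipeline(
    CountVectorizer(stop_words="english"),
    LatentDirichletAllocation(n_components=5, random_state=0),  # topic engine
    LogisticRegression(max_iter=1000),
)

pred = cross_val_predict(pipeline, tweets, labels, cv=5)        # N-fold cross-validation
print(confusion_matrix(labels, pred))
print("recall:", recall_score(labels, pred))
```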

Keywords: syndromic surveillance, tweets, machine learning, data mining, latent Dirichlet allocation (LDA), influenza

Procedia PDF Downloads 102
24484 Divalent Iron Oxidative Process for Degradation of Carbon and Nitrogen Based Pollutants from Dye Intermediate Industrial Wastewater

Authors: Nibedita Pani, Vishnu Tejani, T. S. Anantha Singh

Abstract:

Water pollution resulting from the discharge of partially treated or untreated textile wastewater containing high levels of carbon and nitrogen pollutants poses a huge threat to the environment, the ecosystem, and human health. It is essential to remove carbon- and nitrogen-based organic pollutants more effectively from industrial wastewater before discharge. The present study focuses on the removal of carbon-based pollution, measured as chemical oxygen demand (COD), and nitrogen-based pollution, in particular ammoniacal nitrogen, by the Fenton oxidation process using Fe²⁺ and H₂O₂ as reagents. The study was carried out with high-strength wastewater containing an initial COD of 5632 mg/L and NH₄⁺-N of 1372 mg/L. The major operating condition, pH, was varied between 1.0 and 4.0. The maximum degradation was obtained at pH 3.0, taking the Fe²⁺/H₂O₂ molar ratio as 1:1. At this pH, the removal efficiencies of COD and ammoniacal nitrogen were found to be 77.27% and 74.9%, respectively. The Fenton process can be the best alternative for the simultaneous removal of COD and NH₄⁺-N from industrial wastewater.
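
For reference, the removal efficiencies quoted above follow the usual definition; a quick check with the reported initial COD is shown below, where the treated concentration is back-calculated for illustration (it is not stated in the abstract).

```latex
% Removal efficiency, with C_0 the initial and C_t the treated concentration:
\[
  \eta(\%) = \frac{C_0 - C_t}{C_0} \times 100
\]
% With the reported initial COD of 5632 mg/L, a 77.27% removal implies a
% treated COD of roughly 1280 mg/L:
\[
  \eta_{\mathrm{COD}} = \frac{5632 - 1280}{5632} \times 100 \approx 77.27\,\%
\]
```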

Keywords: ammoniacal nitrogen, COD, Fenton oxidation, industrial wastewater

Procedia PDF Downloads 183
24483 Analysis of Urban Population Using Twitter Distribution Data: Case Study of Makassar City, Indonesia

Authors: Yuyun Wabula, B. J. Dewancker

Abstract:

In the past decade, social networking apps have grown very rapidly. Geolocation data is one of the important features of social media that can attach the user's real-world location coordinates. This paper proposes the use of geolocation data from the Twitter social media application to gain knowledge about urban dynamics, especially human mobility behavior. The paper aims to explore the relation between Twitter geolocation and the presence of people in the urban area. Firstly, the study analyzes the spread of people in particular areas within the city using Twitter social media data. Secondly, we match and categorize the existing places based on the same individuals visiting them. Then, we combine the Twitter data from the tracking result with the questionnaire data to capture the Twitter user profiles. To do that, we used frequency distribution analysis to learn the visitors’ percentages. To validate the hypothesis, we compare the results with the local population statistics and the land use map released by the city planning department of the Makassar local government. The results show that there is a correlation between the Twitter geolocation and questionnaire data. Thus, integrating Twitter data and survey data can reveal the profile of the social media users.

Keywords: geolocation, Twitter, distribution analysis, human mobility

Procedia PDF Downloads 301
24482 Analysis and Rule Extraction of Coronary Artery Disease Data Using Data Mining

Authors: Rezaei Hachesu Peyman, Oliyaee Azadeh, Salahzadeh Zahra, Alizadeh Somayyeh, Safaei Naser

Abstract:

Coronary artery disease (CAD) is one of the major causes of disability in adults and one of the main causes of death in developed countries. In this study, data mining techniques including decision trees, artificial neural networks (ANNs), and support vector machines (SVM) are used to analyze CAD data. Data of 4948 patients who had suffered from heart disease were included in the analysis. CAD is the target variable, and 24 inputs or predictor variables are used for the classification. The performance of these techniques is compared in terms of sensitivity, specificity, and accuracy. The most significant factor influencing CAD is chest pain. Elderly males (age > 53) have a high probability of being diagnosed with CAD. The SVM algorithm is the most useful for the evaluation and prediction of CAD patients as compared to non-CAD ones. The application of data mining techniques in analyzing coronary artery disease is a good method for investigating the existing relationships between variables.
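
A brief sketch of how the reported comparison metrics can be computed from a confusion matrix, using scikit-learn on synthetic data; the synthetic 24-column feature matrix and the SVM settings are illustrative, not the study's actual predictors.

```python
# Sketch: sensitivity, specificity and accuracy of an SVM on toy CAD-style data.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import confusion_matrix, accuracy_score

rng = np.random.default_rng(0)
X = rng.normal(size=(4948, 24))                     # 24 predictor variables (synthetic)
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=4948) > 0).astype(int)  # CAD yes/no

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0, stratify=y)
y_pred = SVC(kernel="rbf").fit(X_tr, y_tr).predict(X_te)

tn, fp, fn, tp = confusion_matrix(y_te, y_pred).ravel()
print("sensitivity:", tp / (tp + fn))               # recall for the CAD class
print("specificity:", tn / (tn + fp))
print("accuracy:   ", accuracy_score(y_te, y_pred))
```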

Keywords: classification, coronary artery disease, data-mining, knowledge discovery, extract

Procedia PDF Downloads 643
24481 Sensor Data Analysis for a Large Mining Major

Authors: Sudipto Shanker Dasgupta

Abstract:

One of the largest mining companies wanted to look at health analytics for their driverless trucks. These trucks were the key to their supply chain logistics. The automated trucks had multi-level sub-assemblies which would send out sensor information. The use case that was worked on was to capture the sensor signals from the truck subcomponents and analyze the health of the trucks from a repair and replacement perspective. Open source software was used to stream the data into a clustered Hadoop setup in the Amazon Web Services cloud, and Apache Spark SQL was used to analyze the data. All of this was achieved through a 10-node Amazon setup (32 cores, 64 GB RAM); real-time analytics was achieved on ‘300 million records’. To check the scalability of the system, the cluster was increased to a 100-node setup. This talk will highlight how open source software was used to achieve the above use case and share insights on the high data throughput achieved on a cloud setup.
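
A minimal PySpark sketch of the kind of analysis described: reading streamed sensor records (here simply from JSON files) and querying sub-assembly health with Spark SQL. The schema, storage path, metric, and threshold are all hypothetical.

```python
# Sketch: querying truck sub-assembly sensor records with Spark SQL.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("truck-health").getOrCreate()

# Hypothetical records: {"truck_id": "...", "subassembly": "...", "vibration": 0.7, "ts": "..."}
sensors = spark.read.json("s3://example-bucket/truck-sensors/")   # path is illustrative
sensors.createOrReplaceTempView("sensors")

# Flag sub-assemblies whose average vibration exceeds a (hypothetical) repair threshold.
flagged = spark.sql("""
    SELECT truck_id, subassembly, AVG(vibration) AS avg_vibration, COUNT(*) AS readings
    FROM sensors
    GROUP BY truck_id, subassembly
    HAVING AVG(vibration) > 0.8
    ORDER BY avg_vibration DESC
""")
flagged.show(20)
```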

Keywords: streaming analytics, data science, big data, Hadoop, high throughput, sensor data

Procedia PDF Downloads 395
24480 A Highly Sensitive Dip Strip for Detection of Phosphate in Water

Authors: Hojat Heidari-Bafroui, Amer Charbaji, Constantine Anagnostopoulos, Mohammad Faghri

Abstract:

Phosphorus is an essential nutrient for plant life and is most frequently found as phosphate in water. Once phosphate is found in abundance in surface water, a series of adverse effects on an ecosystem can be initiated. Therefore, a portable and reliable method is needed to monitor phosphate concentrations in the field. In this paper, an inexpensive dip strip device, with the ascorbic acid/antimony reagent dried on blotting paper and used along with wet chemistry, is developed for the detection of low concentrations of phosphate in water. Ammonium molybdate and sulfuric acid are stored separately in liquid form so as to significantly improve the lifetime of the device and enhance the reproducibility of its performance. The limits of detection and quantification for the optimized device are 0.134 ppm and 0.472 ppm for phosphate in water, respectively. The device’s shelf life, storage conditions, and limit of detection are superior to those previously reported for paper-based phosphate detection devices.

Keywords: phosphate detection, paper-based device, molybdenum blue method, colorimetric assay

Procedia PDF Downloads 158
24479 Data-Centric Anomaly Detection with Diffusion Models

Authors: Sheldon Liu, Gordon Wang, Lei Liu, Xuefeng Liu

Abstract:

Anomaly detection, also referred to as one-class classification, plays a crucial role in identifying product images that deviate from the expected distribution. This study introduces Data-centric Anomaly Detection with Diffusion Models (DCADDM), presenting a systematic strategy for data collection and further diversifying the data with image generation via diffusion models. The algorithm addresses data collection challenges in real-world scenarios and points toward data augmentation through the integration of generative AI capabilities. The paper explores the generation of normal images using diffusion models. The experiments demonstrate that, with 30% of the original set of normal images, modeling in an unsupervised setting with state-of-the-art approaches can achieve equivalent performance. With the addition of images generated via diffusion models (equivalent to 10% of the original dataset size), the proposed algorithm achieves better or equivalent anomaly localization performance.

Keywords: diffusion models, anomaly detection, data-centric, generative AI

Procedia PDF Downloads 72
24478 Assessment of the Possible Effects of Biological Control Agents of Lantana camara and Chromolaena odorata in Davao City, Mindanao, Philippines

Authors: Cristine P. Canlas, Crislene Mae L. Gever, Patricia Bea R. Rosialda, Ma. Nina Regina M. Quibod, Perry Archival C. Buenavente, Normandy M. Barbecho, Cynthia Adeline A. Layusa, Michael Day

Abstract:

Invasive plants have an impact on global biodiversity and ecosystem function, and their management is a complex and formidable task. Two of these invasive plant species, Lantana camara and Chromolaena odorata, are found in the Philippines. Lantana camara has the ability to suppress the growth of and outcompete neighboring plants. Chromolaena odorata causes serious agricultural and economic damage and creates fire hazards during the dry season. In addition, both species have been reported to poison livestock. One of the known global management strategies to control invasive plants is the introduction of biological control agents. These natural enemies reduce the population density and impacts of the invasive plants, helping to restore the balance of nature in invaded areas. Through secondary data sources, interviews, and field validation (e.g. microhabitat searches, sweep netting, opportunistic sampling, photo-documentation), we investigated whether the biocontrol agents previously released by the Philippine Coconut Authority (PCA) in their Davao Research Center to control these invasive plants are still present and are affecting their respective host weeds. We confirm the presence of the biocontrol agent of L. camara, Uroplata girardi, which was introduced in 1985, and of Cecidochares connexa, a biocontrol agent of C. odorata released in 2003. Four other biocontrol agents were found to affect L. camara. Signs of damage (e.g. stem galls in C. odorata and leaf mines in L. camara) signify that these biocontrol agents have successfully established outside of their release site in Davao. Further investigation of the extent of the spread of these biocontrol agents in the Philippines and of their damage to the two weeds will contribute to the management of invasive plant species in the country.

Keywords: invasive alien species, biological control agent, entomology, worst weeds

Procedia PDF Downloads 363
24477 Regulation on the Protection of Personal Data Versus Quality Data Assurance in the Healthcare System Case Report

Authors: Elizabeta Krstić Vukelja

Abstract:

The digitization of personal data is a consequence of the development of information and communication technologies that create a new work environment with many advantages and challenges, but also potential threats to privacy and personal data protection. Regulation (EU) 2016/679 of the European Parliament and of the Council is a law and obligation that should address the issues of personal data protection and information security. The existence of the Regulation leads to the conclusion that national legislation in the field of the virtual environment, the protection of the rights of EU citizens, and the processing of their personal data is insufficiently effective. In the health system, special emphasis is placed on the processing of special categories of personal data, such as health data. The healthcare industry is recognized as a particularly sensitive area in which a large amount of medical data is processed, the digitization of which enables quick access and quick identification of the insured. The protection of the individual requires quality IT solutions that guarantee the technical protection of these special categories of data. However, the real problems are of a technical and human nature, along with the spatial limitations of the Regulation's application. Some conclusions will be drawn by analyzing the implementation of the basic principles of the Regulation using the example of the Croatian healthcare system and comparing it with similar activities in other EU member states.

Keywords: regulation, healthcare system, personal data protection, quality data assurance

Procedia PDF Downloads 28
24476 Parallel Vector Processing Using Multi Level Orbital DATA

Authors: Nagi Mekhiel

Abstract:

Many applications use vector operations by applying a single instruction to multiple data elements that map to different locations in conventional memory. Transferring data from memory is limited by access latency and bandwidth, affecting the performance gain of vector processing. We present a memory system that makes all of its content available to processors in time so that the processors need not access the memory; instead, each location is forced to be available to all processors at a specific time. The data move in different orbits to become available to other processors in higher orbits at different times. We use this memory to apply parallel vector operations to data streams at the first orbit level. Data processed at the first level move to the upper orbit one data element at a time, allowing a processor in that orbit to apply another vector operation to deal with the serial code limitations inherent in all parallel applications and to interleave it with lower-level vector operations.

Keywords: memory organization, parallel processors, serial code, vector processing

Procedia PDF Downloads 255
24475 Reconstructability Analysis for Landslide Prediction

Authors: David Percy

Abstract:

Landslides are a geologic phenomenon that affects a large number of inhabited places and are constantly being monitored and studied for the prediction of future occurrences. Reconstructability analysis (RA) is a methodology for extracting informative models from large volumes of data; it works exclusively with discrete data. While RA has been used extensively in medical applications and social science, we are introducing it to the spatial sciences through applications like landslide prediction. Since RA works exclusively with discrete data, such as soil classification or bedrock type, working with continuous data, such as porosity, requires that these data be binned for inclusion in the model. RA constructs models of the data which pick out the most informative elements, the independent variables (IVs), from each layer that predict the dependent variable (DV), landslide occurrence. Each layer included in the model retains its classification data as a primary encoding of the data. Unlike other machine learning algorithms that force the data into one-hot encoding schemes, RA works directly with the data as it is encoded, with the exception of continuous data, which must be binned. The usual physical and derived layers are included in the model, and testing our results against other published methodologies, such as neural networks, yields similar accuracy but with the advantage of a completely transparent model. The results of an RA session with a data set are a report on every combination of variables and their probability of landslide events occurring. In this way, every informative combination of variable states can be examined.
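
A simplified sketch of the RA-style workflow described above: binning a continuous layer, then reporting, for each combination of discrete independent variables, the observed probability of the dependent variable (landslide occurrence). The layers, bin edges, and synthetic values are invented; real RA model search is considerably richer.

```python
# Sketch: bin a continuous layer and tabulate P(landslide | IV combination).
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 5000
df = pd.DataFrame({
    "bedrock": rng.choice(["basalt", "sandstone", "shale"], n),   # already discrete
    "porosity": rng.uniform(0.05, 0.45, n),                       # continuous -> must be binned
    "slide": rng.integers(0, 2, n),                               # DV: landslide yes/no
})

# Continuous data must be binned before inclusion in the model.
df["porosity_bin"] = pd.cut(df["porosity"], bins=[0, 0.15, 0.30, 0.45],
                            labels=["low", "mid", "high"])

table = (df.groupby(["bedrock", "porosity_bin"], observed=True)["slide"]
           .agg(p_landslide="mean", count="size")
           .reset_index())
print(table.sort_values("p_landslide", ascending=False))
```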

Keywords: reconstructability analysis, machine learning, landslides, raster analysis

Procedia PDF Downloads 48
24474 Forest Fire Risk Mapping Using Analytic Hierarchy Process and GIS-Based Application: A Case Study in Hua Sai District, Thailand

Authors: Narissara Nuthammachot, Dimitris Stratoulias

Abstract:

Fire is one of the main causes of environmental and ecosystem change. Fire risk assessment and fire potential mapping are therefore challenging tasks. The study area is Hua Sai district, Nakorn Sri Thammarat province, which partly covers peat swamp forest areas. Fifty-five fire points in peat swamp areas were reported from 2012 to 2016. The Analytic Hierarchy Process (AHP) and Geographic Information System (GIS) methods were selected for this study. The fire risk area map was built from these factors: elevation, slope, aspect, precipitation, distance from the river, distance from town, and land use. The results showed that the predicted fire risk areas agree with appreciable reliability with past fire events. The fire risk map can be used for the planning and management of fire-prone areas in the future.
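
A short sketch of the AHP step: deriving factor weights from a pairwise comparison matrix via its principal eigenvector and checking consistency. The comparison judgments for three of the factors are invented for illustration, not the study's actual values.

```python
# Sketch: AHP weights from a pairwise comparison matrix (illustrative judgments).
import numpy as np

factors = ["elevation", "slope", "land use"]
A = np.array([[1,   3,   5],          # how strongly elevation is preferred over slope, land use, ...
              [1/3, 1,   3],
              [1/5, 1/3, 1]])

eigvals, eigvecs = np.linalg.eig(A)
k = np.argmax(eigvals.real)
w = np.abs(eigvecs[:, k].real)
w /= w.sum()                          # priority weights, summing to 1

n = len(factors)
CI = (eigvals[k].real - n) / (n - 1)  # consistency index
CR = CI / 0.58                        # Saaty's random index for n=3 is 0.58
print(dict(zip(factors, np.round(w, 3))), "CR =", round(CR, 3))  # CR < 0.1 means acceptable
```

The resulting weights would then be applied to the reclassified GIS layers (e.g. with weighted overlay) to produce the composite fire risk map.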

Keywords: analytic hierarchy process, fire risk assessment, geographic information system, peat swamp forest

Procedia PDF Downloads 194
24473 Data Analytics in Hospitality Industry

Authors: Tammy Wee, Detlev Remy, Arif Perdana

Abstract:

In recent years, data analytics has become the buzzword in the hospitality industry. The hospitality industry is another example of a data-rich industry that has yet to fully benefit from the insights of data analytics. Effective use of data analytics can change how hotels operate, market, and position themselves competitively in the hospitality industry. However, at the moment, the data obtained by individual hotels remain under-utilized. This is preliminary research on data analytics in the hospitality industry, using an in-depth face-to-face interview at one hotel as the start of a multi-level study. The main case study of this research, hotel A, is an international hotel chain brand that has been systematically gathering and collecting data on its own customers for the past five years. The data collection points begin from the moment a guest books a room until the guest leaves the hotel premises, and include room reservation, spa booking, and catering. Although hotel A has been gathering data intelligence on its customers for some time, it has yet to utilize the data to their fullest potential, and it is aware of this limitation as well as the potential of data analytics. Currently, the utilization of data analytics in hotel A is limited to the area of customer service improvement, namely enhancing the personalization of service for each individual customer. Hotel A is able to utilize the data to improve and enhance its service, which in turn encourages repeat customers. According to hotel A, 50% of its guests returned to the hotel, and 70% extended their nights, because of the personalized service. Apart from using data analytics for enhancing customer service, hotel A also uses the data in marketing. Hotel A uses data analytics to predict or forecast changes in consumer behavior and demand by tracking its guests' booking preferences, payment preferences, and demand shifts between properties. However, hotel A admitted that the data it has been collecting are not fully utilized due to two challenges. The first challenge of using data analytics in hotel A is that the data are not clean. At the moment, the data collected for one guest profile are meaningful only for one department in the hotel but meaningless for another department. Cleaning up the data and getting standards right for usage by different departments are some of the main concerns of hotel A. The second challenge of using data analytics in hotel A is the lack of an integrated internal system. At the moment, the internal systems used by hotel A do not integrate with each other well, limiting the ability to collect data systematically. Hotel A is considering another system to replace the current one for more comprehensive data collection. Hotel proprietors recognize the potential of data analytics, as reported in this research; however, the current challenges of implementing a system to collect data come with a cost. This research has identified the current utilization of data analytics and the challenges faced when it comes to implementing data analytics.

Keywords: data analytics, hospitality industry, customer relationship management, hotel marketing

Procedia PDF Downloads 164
24472 Realization of a (GIS) for Drilling (DWS) through the Adrar Region

Authors: Djelloul Benatiallah, Ali Benatiallah, Abdelkader Harouz

Abstract:

Geographic Information Systems (GIS) include various methods and computer techniques to model, digitally capture, store, manage, view, and analyze geographic data. Geographic information systems have the characteristic of appealing to many scientific and technical fields, and to many methods. In this article we present a complete and operational geographic information system, following the theoretical principles of data management and adapting them to spatial data, especially data concerning the monitoring of drinking water supply (DWS) wells in the Adrar region. The expected results of this system are, on the one hand, to offer standard consultation features for updating and editing beneficiary and geographical data, and, on the other hand, to provide contractors with specific functionality for data entry, parameterized calculations, and statistics.

Keywords: GIS, DWS, drilling, Adrar

Procedia PDF Downloads 295
24471 Geospatial Assessments on Impacts of Land Use Changes and Climate Change in Nigeria Forest Ecosystems

Authors: Samuel O. Akande

Abstract:

Human-induced climate change is likely to have severe consequences for forest ecosystems in Nigeria. Recent discussions of and emphasis on issues concerning the environment justify the need for this research, which examined deforestation monitoring in the Oban Forest, Nigeria, using remote sensing techniques. Landsat images from the TM (1986), ETM+ (2001), and OLI (2015) sensors were obtained from the Landsat online archive and processed using Erdas Imagine 2014 and ArcGIS 10.3 to obtain land use/land cover and Normalized Differential Vegetative Index (NDVI) values. Ground control points of deforested areas were collected for validation. It was observed that the forest cover decreased in area by about 689.14 km² between 1986 and 2015. The NDVI was used to determine the vegetation health of the forest and its implications for agricultural sustainability. The result showed that the total percentage of healthy forest cover was reduced to about 45.9% between 1986 and 2015. The results obtained from the analysed questionnaires showed that there was a positive correlation between the causes and effects of deforestation in the study area. The coefficient of determination was calculated as R² ≥ 0.7 to ascertain the extent to which anthropogenic activities, such as fuelwood harvesting, intensive farming, logging, urbanization, and engineering construction activities, were responsible for deforestation in the study area. Similarly, temperature and rainfall data for the period 1986 to 2015 were obtained from the Nigerian Meteorological Agency (NIMET) for the study area. It was observed that there was a significant increase in temperature while rainfall decreased over the study area. Responses from the administered questionnaires also showed that the destruction of the forest ecosystem in the Oban Forest could be reduced to its barest minimum if fuelwood harvesting were disallowed. Thus, the projected impacts of climate change on Nigeria’s forest ecosystems and environmental stability are better imagined than experienced.
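
For reference, NDVI as used above is computed per pixel from the red and near-infrared bands; a small numpy sketch is shown below, with synthetic band arrays standing in for the Landsat rasters and an illustrative (not the study's) health threshold.

```python
# Sketch: per-pixel NDVI = (NIR - Red) / (NIR + Red), with synthetic band arrays.
import numpy as np

rng = np.random.default_rng(0)
red = rng.uniform(0.02, 0.25, size=(512, 512))   # surface reflectance, red band
nir = rng.uniform(0.10, 0.55, size=(512, 512))   # near-infrared band

ndvi = (nir - red) / (nir + red + 1e-9)          # small epsilon avoids division by zero

healthy_fraction = np.mean(ndvi > 0.4)           # 0.4 is an illustrative health threshold
print(f"mean NDVI: {ndvi.mean():.3f}, share of pixels above 0.4: {healthy_fraction:.1%}")
```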

Keywords: deforestation, ecosystems, normalized differential vegetative index, sustainability

Procedia PDF Downloads 180
24470 Generic Data Warehousing for Consumer Electronics Retail Industry

Authors: S. Habte, K. Ouazzane, P. Patel, S. Patel

Abstract:

The dynamic and highly competitive nature of the consumer electronics retail industry means that businesses in this industry are experiencing different decision-making challenges in relation to pricing, inventory control, consumer satisfaction, and product offerings. To overcome the challenges facing retailers and create opportunities, we propose a generic data warehousing solution which can be applied to a wide range of consumer electronics retailers with minimal configuration. The solution includes a dimensional data model, a template SQL script, a high-level architectural description, an ETL tool developed using C#, a set of APIs, and data access tools. It has been successfully applied by ASK Outlets Ltd (UK), resulting in improved productivity and enhanced sales growth.
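
To make the idea of a dimensional data model concrete, the sketch below builds a tiny star schema (one sales fact table with product and date dimensions) in SQLite from Python; the table and column names are illustrative, not the template shipped with the described solution.

```python
# Sketch: a minimal retail star schema (fact + dimensions) in SQLite.
import sqlite3

ddl = """
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, sku TEXT, category TEXT);
CREATE TABLE dim_date    (date_key INTEGER PRIMARY KEY, full_date TEXT, month TEXT);
CREATE TABLE fact_sales  (
    product_key INTEGER REFERENCES dim_product(product_key),
    date_key    INTEGER REFERENCES dim_date(date_key),
    units_sold  INTEGER,
    revenue     REAL
);
"""

con = sqlite3.connect(":memory:")
con.executescript(ddl)
con.execute("INSERT INTO dim_product VALUES (1, 'TV-55-UHD', 'Televisions')")
con.execute("INSERT INTO dim_date VALUES (20240301, '2024-03-01', '2024-03')")
con.execute("INSERT INTO fact_sales VALUES (1, 20240301, 3, 1797.0)")

# Typical analytical query: revenue by category and month.
for row in con.execute("""
        SELECT p.category, d.month, SUM(f.revenue)
        FROM fact_sales f
        JOIN dim_product p USING (product_key)
        JOIN dim_date d USING (date_key)
        GROUP BY p.category, d.month"""):
    print(row)
```

An ETL process (C# in the described solution) would populate the dimensions first and then load facts keyed against them.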

Keywords: consumer electronics, data warehousing, dimensional data model, generic, retail industry

Procedia PDF Downloads 399
24469 Sequential Data Assimilation with High-Frequency (HF) Radar Surface Current

Authors: Lei Ren, Michael Hartnett, Stephen Nash

Abstract:

The abundant surface current measurements from an HF radar system in a coastal area are assimilated into a model to improve its forecasting ability. A simple sequential data assimilation scheme, Direct Insertion (DI), is applied to update the model forecast states. The influence of Direct Insertion data assimilation over time is analyzed at one reference point. Vector maps of surface current from the models are compared with the HF radar measurements. The root-mean-squared error (RMSE) between the modeling results and the HF radar measurements is calculated over the last four days with no data assimilation.
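
A toy numpy sketch of the Direct Insertion idea and the RMSE diagnostic: wherever a radar observation exists, the model state is simply overwritten with it, and RMSE against the radar field summarizes the remaining misfit. The grid size, current fields, and radar coverage mask are synthetic.

```python
# Sketch: Direct Insertion of HF radar surface currents into a model field, plus RMSE.
import numpy as np

rng = np.random.default_rng(0)
model_u = rng.normal(0.2, 0.05, size=(50, 50))             # modelled eastward current (m/s)
radar_u = model_u + rng.normal(0.0, 0.08, size=(50, 50))   # synthetic "observations"
coverage = rng.random((50, 50)) < 0.6                      # radar covers ~60% of the grid

def direct_insertion(model, obs, mask):
    """Overwrite model values with observations wherever observations exist."""
    updated = model.copy()
    updated[mask] = obs[mask]
    return updated

def rmse(a, b, mask):
    return np.sqrt(np.mean((a[mask] - b[mask]) ** 2))

print("RMSE before DI:", round(rmse(model_u, radar_u, coverage), 4))
analysis_u = direct_insertion(model_u, radar_u, coverage)
print("RMSE after DI: ", round(rmse(analysis_u, radar_u, coverage), 4))  # ~0 by construction
```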

Keywords: data assimilation, CODAR, HF radar, surface current, direct insertion

Procedia PDF Downloads 556