Search results for: Data Mining
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 25091

Search results for: Data Mining

24281 Cloud Design for Storing Large Amount of Data

Authors: M. Strémy, P. Závacký, P. Cuninka, M. Juhás

Abstract:

Main goal of this paper is to introduce our design of private cloud for storing large amount of data, especially pictures, and to provide good technological backend for data analysis based on parallel processing and business intelligence. We have tested hypervisors, cloud management tools, storage for storing all data and Hadoop to provide data analysis on unstructured data. Providing high availability, virtual network management, logical separation of projects and also rapid deployment of physical servers to our environment was also needed.

Keywords: cloud, glusterfs, hadoop, juju, kvm, maas, openstack, virtualization

Procedia PDF Downloads 348
24280 Estimation of Missing Values in Aggregate Level Spatial Data

Authors: Amitha Puranik, V. S. Binu, Seena Biju

Abstract:

Missing data is a common problem in spatial analysis especially at the aggregate level. Missing can either occur in covariate or in response variable or in both in a given location. Many missing data techniques are available to estimate the missing data values but not all of these methods can be applied on spatial data since the data are autocorrelated. Hence there is a need to develop a method that estimates the missing values in both response variable and covariates in spatial data by taking account of the spatial autocorrelation. The present study aims to develop a model to estimate the missing data points at the aggregate level in spatial data by accounting for (a) Spatial autocorrelation of the response variable (b) Spatial autocorrelation of covariates and (c) Correlation between covariates and the response variable. Estimating the missing values of spatial data requires a model that explicitly account for the spatial autocorrelation. The proposed model not only accounts for spatial autocorrelation but also utilizes the correlation that exists between covariates, within covariates and between a response variable and covariates. The precise estimation of the missing data points in spatial data will result in an increased precision of the estimated effects of independent variables on the response variable in spatial regression analysis.

Keywords: spatial regression, missing data estimation, spatial autocorrelation, simulation analysis

Procedia PDF Downloads 371
24279 Immunization-Data-Quality in Public Health Facilities in the Pastoralist Communities: A Comparative Study Evidence from Afar and Somali Regional States, Ethiopia

Authors: Melaku Tsehay

Abstract:

The Consortium of Christian Relief and Development Associations (CCRDA), and the CORE Group Polio Partners (CGPP) Secretariat have been working with Global Alliance for Vac-cines and Immunization (GAVI) to improve the immunization data quality in Afar and Somali Regional States. The main aim of this study was to compare the quality of immunization data before and after the above interventions in health facilities in the pastoralist communities in Ethiopia. To this end, a comparative-cross-sectional study was conducted on 51 health facilities. The baseline data was collected in May 2019, while the end line data in August 2021. The WHO data quality self-assessment tool (DQS) was used to collect data. A significant improvment was seen in the accuracy of the pentavalent vaccine (PT)1 (p = 0.012) data at the health posts (HP), while PT3 (p = 0.010), and Measles (p = 0.020) at the health centers (HC). Besides, a highly sig-nificant improvment was observed in the accuracy of tetanus toxoid (TT)2 data at HP (p < 0.001). The level of over- or under-reporting was found to be < 8%, at the HP, and < 10% at the HC for PT3. The data completeness was also increased from 72.09% to 88.89% at the HC. Nearly 74% of the health facilities timely reported their respective immunization data, which is much better than the baseline (7.1%) (p < 0.001). These findings may provide some hints for the policies and pro-grams targetting on improving immunization data qaulity in the pastoralist communities.

Keywords: data quality, immunization, verification factor, pastoralist region

Procedia PDF Downloads 101
24278 Identifying Critical Success Factors for Data Quality Management through a Delphi Study

Authors: Maria Paula Santos, Ana Lucas

Abstract:

Organizations support their operations and decision making on the data they have at their disposal, so the quality of these data is remarkably important and Data Quality (DQ) is currently a relevant issue, the literature being unanimous in pointing out that poor DQ can result in large costs for organizations. The literature review identified and described 24 Critical Success Factors (CSF) for Data Quality Management (DQM) that were presented to a panel of experts, who ordered them according to their degree of importance, using the Delphi method with the Q-sort technique, based on an online questionnaire. The study shows that the five most important CSF for DQM are: definition of appropriate policies and standards, control of inputs, definition of a strategic plan for DQ, organizational culture focused on quality of the data and obtaining top management commitment and support.

Keywords: critical success factors, data quality, data quality management, Delphi, Q-Sort

Procedia PDF Downloads 208
24277 A Recognition Method for Spatio-Temporal Background in Korean Historical Novels

Authors: Seo-Hee Kim, Kee-Won Kim, Seung-Hoon Kim

Abstract:

The most important elements of a novel are the characters, events and background. The background represents the time, place and situation that character appears, and conveys event and atmosphere more realistically. If readers have the proper knowledge about background of novels, it may be helpful for understanding the atmosphere of a novel and choosing a novel that readers want to read. In this paper, we are targeting Korean historical novels because spatio-temporal background especially performs an important role in historical novels among the genre of Korean novels. To the best of our knowledge, we could not find previous study that was aimed at Korean novels. In this paper, we build a Korean historical national dictionary. Our dictionary has historical places and temple names of kings over many generations as well as currently existing spatial words or temporal words in Korean history. We also present a method for recognizing spatio-temporal background based on patterns of phrasal words in Korean sentences. Our rules utilize postposition for spatial background recognition and temple names for temporal background recognition. The knowledge of the recognized background can help readers to understand the flow of events and atmosphere, and can use to visualize the elements of novels.

Keywords: data mining, Korean historical novels, Korean linguistic feature, spatio-temporal background

Procedia PDF Downloads 272
24276 Slope Stability Assessment in Metasedimentary Deposit of an Opencast Mine: The Case of the Dikuluwe-Mashamba (DIMA) Mine in the DR Congo

Authors: Dina Kon Mushid, Sage Ngoie, Tshimbalanga Madiba, Kabutakapua Kakanda

Abstract:

Slope stability assessment is still the biggest challenge in mining activities and civil engineering structures. The slope in an opencast mine frequently reaches multiple weak layers that lead to the instability of the pit. Faults and soft layers throughout the rock would increase weathering and erosion rates. Therefore, it is essential to investigate the stability of the complex strata to figure out how stable they are. In the Dikuluwe-Mashamba (DIMA) area, the lithology of the stratum is a set of metamorphic rocks whose parent rocks are sedimentary rocks with a low degree of metamorphism. Thus, due to the composition and metamorphism of the parent rock, the rock formation is different in hardness and softness, which means that when the content of dolomitic and siliceous is high, the rock is hard. It is softer when the content of argillaceous and sandy is high. Therefore, from the vertical direction, it appears as a weak and hard layer, and from the horizontal direction, it seems like a smooth and hard layer in the same rock layer. From the structural point of view, the main structures in the mining area are the Dikuluwe dipping syncline and the Mashamba dipping anticline, and the occurrence of rock formations varies greatly. During the folding process of the rock formation, the stress will concentrate on the soft layer, causing the weak layer to be broken. At the same time, the phenomenon of interlayer dislocation occurs. This article aimed to evaluate the stability of metasedimentary rocks of the Dikuluwe-Mashamba (DIMA) open-pit mine using limit equilibrium and stereographic methods Based on the presence of statistical structural planes, the stereographic projection was used to study the slope's stability and examine the discontinuity orientation data to identify failure zones along the mine. The results revealed that the slope angle is too steep, and it is easy to induce landslides. The numerical method's sensitivity analysis showed that the slope angle and groundwater significantly impact the slope safety factor. The increase in the groundwater level substantially reduces the stability of the slope. Among the factors affecting the variation in the rate of the safety factor, the bulk density of soil is greater than that of rock mass, the cohesion of soil mass is smaller than that of rock mass, and the friction angle in the rock mass is much larger than that in the soil mass. The analysis showed that the rock mass structure types are mostly scattered and fragmented; the stratum changes considerably, and the variation of rock and soil mechanics parameters is significant.

Keywords: slope stability, weak layer, safety factor, limit equilibrium method, stereography method

Procedia PDF Downloads 255
24275 Fractional, Component and Morphological Composition of Ambient Air Dust in the Areas of Mining Industry

Authors: S.V. Kleyn, S.Yu. Zagorodnov, А.А. Kokoulina

Abstract:

Technogenic emissions of the mining and processing complex are characterized by a high content of chemical components and solid dust particles. However, each industrial enterprise and the surrounding area have features that require refinement and parameterization. Numerous studies have shown the negative impact of fine dust PM10 and PM2.5 on the health, as well as the possibility of toxic components absorption, including heavy metals by dust particles. The target of the study was the quantitative assessment of the fractional and particle size composition of ambient air dust in the area of impact by primary magnesium production complex. Also, we tried to describe the morphology features of dust particles. Study methods. To identify the dust emission sources, the analysis of the production process has been carried out. The particulate composition of the emissions was measured using laser particle analyzer Microtrac S3500 (covered range of particle size is 20 nm to 2000 km). Particle morphology and the component composition were established by electron microscopy by scanning microscope of high resolution (magnification rate - 5 to 300 000 times) with X-ray fluorescence device S3400N ‘HITACHI’. The chemical composition was identified by X-ray analysis of the samples using an X-ray diffractometer XRD-700 ‘Shimadzu’. Determination of the dust pollution level was carried out using model calculations of emissions in the atmosphere dispersion. The calculations were verified by instrumental studies. Results of the study. The results demonstrated that the dust emissions of different technical processes are heterogeneous and fractional structure is complicated. The percentage of particle sizes up to 2.5 micrometres inclusive was ranged from 0.00 to 56.70%; particle sizes less than 10 microns inclusive – 0.00 - 85.60%; particle sizes greater than 10 microns - 14.40% -100.00%. During microscopy, the presence of nanoscale size particles has been detected. Studied dust particles are round, irregular, cubic and integral shapes. The composition of the dust includes magnesium, sodium, potassium, calcium, iron, chlorine. On the base of obtained results, it was performed the model calculations of dust emissions dispersion and establishment of the areas of fine dust РМ 10 and РМ 2.5 distribution. It was found that the dust emissions of fine powder fractions PM10 and PM2.5 are dispersed over large distances and beyond the border of the industrial site of the enterprise. The population living near the enterprise is exposed to the risk of diseases associated with dust exposure. Data are transferred to the economic entity to make decisions on the measures to minimize the risks. Exposure and risks indicators on the health are used to provide named patient health and preventive care to the citizens living in the area of negative impact of the facility.

Keywords: dust emissions, еxposure assessment, PM 10, PM 2.5

Procedia PDF Downloads 250
24274 Comprehensive Study of Data Science

Authors: Asifa Amara, Prachi Singh, Kanishka, Debargho Pathak, Akshat Kumar, Jayakumar Eravelly

Abstract:

Today's generation is totally dependent on technology that uses data as its fuel. The present study is all about innovations and developments in data science and gives an idea about how efficiently to use the data provided. This study will help to understand the core concepts of data science. The concept of artificial intelligence was introduced by Alan Turing in which the main principle was to create an artificial system that can run independently of human-given programs and can function with the help of analyzing data to understand the requirements of the users. Data science comprises business understanding, analyzing data, ethical concerns, understanding programming languages, various fields and sources of data, skills, etc. The usage of data science has evolved over the years. In this review article, we have covered a part of data science, i.e., machine learning. Machine learning uses data science for its work. Machines learn through their experience, which helps them to do any work more efficiently. This article includes a comparative study image between human understanding and machine understanding, advantages, applications, and real-time examples of machine learning. Data science is an important game changer in the life of human beings. Since the advent of data science, we have found its benefits and how it leads to a better understanding of people, and how it cherishes individual needs. It has improved business strategies, services provided by them, forecasting, the ability to attend sustainable developments, etc. This study also focuses on a better understanding of data science which will help us to create a better world.

Keywords: data science, machine learning, data analytics, artificial intelligence

Procedia PDF Downloads 75
24273 A Study of Mortars with Granulated Blast Furnace Slag as Fine Aggregate and Its Influence on Properties of Burnt Clay Brick Masonry

Authors: Vibha Venkataramu, B. V. Venkatarama Reddy

Abstract:

Natural river sand is the most preferred choice as fine aggregate in masonry mortars. Uncontrolled mining of sand from riverbeds for several decades has had detrimental effects on the environment. Several countries across the world have put strict restrictions on sand mining from riverbeds. However, in countries like India, the huge infrastructural boom has made the local construction industry to look for alternative materials to sand. This study aims at understanding the suitability of granulated blast furnace slag (GBS) as fine aggregates in masonry mortars. Apart from characterising the material properties of GBS, such as particle size distribution, pH, chemical composition, etc., of GBS, tests were performed on the mortars with GBS as fine aggregate. Additionally, the properties of five brick tall, stack bonded masonry prisms with various types of GBS mortars were studied. The mortars with mix proportions 1: 0: 6 (cement: lime: fine aggregate), 1: 1: 6, and 1: 0: 3 were considered for the study. Fresh and hardened properties of mortar, such as flow and compressive strength, were studied. To understand the behaviour of GBS mortars on masonry, tests such as compressive strength and flexure bond strength were performed on masonry prisms made with a different type of GBS mortars. Furthermore, the elastic properties of masonry with GBS mortars were also studied under compression. For comparison purposes, the properties of corresponding control mortars with natural sand as fine aggregate and masonry prisms with sand mortars were also studied under similar testing conditions. From the study, it was observed the addition of GBS negatively influenced the flow of mortars and positively influenced the compressive strength. The GBS mortars showed 20 to 25 % higher compressive strength at 28 days of age, compared to corresponding control mortars. Furthermore, masonry made with GBS mortars showed nearly 10 % higher compressive strengths compared to control specimens. But, the impact of GBS on the flexural strength of masonry was marginal.

Keywords: building materials, fine aggregate, granulated blast furnace slag in mortars, masonry properties

Procedia PDF Downloads 117
24272 The Clash between Environmental and Heritage Laws: An Australian Case Study

Authors: Andrew R. Beatty

Abstract:

The exploitation of Australia’s vast mineral wealth is regulated by a matrix of planning, environment and heritage legislation, and despite the desire for a ‘balance’ between economic, environmental and heritage values, Aboriginal objects and places are often detrimentally impacted by mining approvals. The Australian experience is not novel. There are other cases of clashes between the rights of traditional landowners and businesses seeking to exploit mineral or other resources on or beneath those lands, including in the United States, Canada, and Brazil. How one reconciles the rights of traditional owners with those of resource companies is an ongoing legal problem of general interest. In Australia, planning and environmental approvals for resource projects are ordinarily issued by State or Territory governments. Federal legislation such as the Aboriginal and Torres Strait Islander Heritage Protection Act 1984 (Cth) is intended to act as a safety net when State or Territory legislation is incapable of protecting Indigenous objects or places in the context of approvals for resource projects. This paper will analyse the context and effectiveness of legislation enacted to protect Indigenous heritage in the planning process. In particular, the paper will analyse how the statutory objects of such legislation need to be weighed against the statutory objects of competing legislation designed to facilitate and control resource exploitation. Using a current claim in the Federal Court of Australia for the protection of a culturally significant landscape as a case study, this paper will examine the challenges faced in ascribing value to cultural heritage within the wider context of environmental and planning laws. Our findings will reveal that there is an inherent difficulty in defining and weighing competing economic, environmental and heritage considerations. An alternative framework will be proposed to guide regulators towards making decisions that result in better protection of Indigenous heritage in the context of resource management.

Keywords: environmental law, heritage law, indigenous rights, mining

Procedia PDF Downloads 93
24271 DNpro: A Deep Learning Network Approach to Predicting Protein Stability Changes Induced by Single-Site Mutations

Authors: Xiao Zhou, Jianlin Cheng

Abstract:

A single amino acid mutation can have a significant impact on the stability of protein structure. Thus, the prediction of protein stability change induced by single site mutations is critical and useful for studying protein function and structure. Here, we presented a deep learning network with the dropout technique for predicting protein stability changes upon single amino acid substitution. While using only protein sequence as input, the overall prediction accuracy of the method on a standard benchmark is >85%, which is higher than existing sequence-based methods and is comparable to the methods that use not only protein sequence but also tertiary structure, pH value and temperature. The results demonstrate that deep learning is a promising technique for protein stability prediction. The good performance of this sequence-based method makes it a valuable tool for predicting the impact of mutations on most proteins whose experimental structures are not available. Both the downloadable software package and the user-friendly web server (DNpro) that implement the method for predicting protein stability changes induced by amino acid mutations are freely available for the community to use.

Keywords: bioinformatics, deep learning, protein stability prediction, biological data mining

Procedia PDF Downloads 454
24270 Interpreting Privacy Harms from a Non-Economic Perspective

Authors: Christopher Muhawe, Masooda Bashir

Abstract:

With increased Internet Communication Technology(ICT), the virtual world has become the new normal. At the same time, there is an unprecedented collection of massive amounts of data by both private and public entities. Unfortunately, this increase in data collection has been in tandem with an increase in data misuse and data breach. Regrettably, the majority of data breach and data misuse claims have been unsuccessful in the United States courts for the failure of proof of direct injury to physical or economic interests. The requirement to express data privacy harms from an economic or physical stance negates the fact that not all data harms are physical or economic in nature. The challenge is compounded by the fact that data breach harms and risks do not attach immediately. This research will use a descriptive and normative approach to show that not all data harms can be expressed in economic or physical terms. Expressing privacy harms purely from an economic or physical harm perspective negates the fact that data insecurity may result into harms which run counter the functions of privacy in our lives. The promotion of liberty, selfhood, autonomy, promotion of human social relations and the furtherance of the existence of a free society. There is no economic value that can be placed on these functions of privacy. The proposed approach addresses data harms from a psychological and social perspective.

Keywords: data breach and misuse, economic harms, privacy harms, psychological harms

Procedia PDF Downloads 190
24269 Machine Learning Analysis of Student Success in Introductory Calculus Based Physics I Course

Authors: Chandra Prayaga, Aaron Wade, Lakshmi Prayaga, Gopi Shankar Mallu

Abstract:

This paper presents the use of machine learning algorithms to predict the success of students in an introductory physics course. Data having 140 rows pertaining to the performance of two batches of students was used. The lack of sufficient data to train robust machine learning models was compensated for by generating synthetic data similar to the real data. CTGAN and CTGAN with Gaussian Copula (Gaussian) were used to generate synthetic data, with the real data as input. To check the similarity between the real data and each synthetic dataset, pair plots were made. The synthetic data was used to train machine learning models using the PyCaret package. For the CTGAN data, the Ada Boost Classifier (ADA) was found to be the ML model with the best fit, whereas the CTGAN with Gaussian Copula yielded Logistic Regression (LR) as the best model. Both models were then tested for accuracy with the real data. ROC-AUC analysis was performed for all the ten classes of the target variable (Grades A, A-, B+, B, B-, C+, C, C-, D, F). The ADA model with CTGAN data showed a mean AUC score of 0.4377, but the LR model with the Gaussian data showed a mean AUC score of 0.6149. ROC-AUC plots were obtained for each Grade value separately. The LR model with Gaussian data showed consistently better AUC scores compared to the ADA model with CTGAN data, except in two cases of the Grade value, C- and A-.

Keywords: machine learning, student success, physics course, grades, synthetic data, CTGAN, gaussian copula CTGAN

Procedia PDF Downloads 39
24268 Numerical Modelling of 3-D Fracture Propagation and Damage Evolution of an Isotropic Heterogeneous Rock with a Pre-Existing Surface Flaw under Uniaxial Compression

Authors: S. Mondal, L. M. Olsen-Kettle, L. Gross

Abstract:

Fracture propagation and damage evolution are extremely important for many industrial applications including mining industry, composite materials, earthquake simulations, hydraulic fracturing. The influence of pre-existing flaws and rock heterogeneity on the processes and mechanisms of rock fracture has important ramifications in many mining and reservoir engineering applications. We simulate the damage evolution and fracture propagation in an isotropic sandstone specimen containing a pre-existing 3-D surface flaw in different configurations under uniaxial compression. We apply a damage model based on the unified strength theory and solve the solid deformation and damage evolution equations using the Finite Element Method (FEM) with tetrahedron elements on unstructured meshes through the simulation software, eScript. Unstructured meshes provide higher geometrical flexibility and allow a more accurate way to model the varying flaw depth, angle, and length through locally adapted FEM meshes. The heterogeneity of rock is considered by initializing material properties using a Weibull distribution sampled over a cubic grid. In our model, we introduce a length scale related to the rock heterogeneity which is independent of the mesh size. We investigate the effect of parameters including the heterogeneity of the elastic moduli and geometry of the single flaw in the stress strain response. The generation of three typical surface cracking patterns, called wing cracks, anti-wing cracks and far-field cracks were identified, and these depend on the geometry of the pre-existing surface flaw. This model results help to advance our understanding of fracture and damage growth in heterogeneous rock with the aim to develop fracture simulators for different industry applications.

Keywords: finite element method, heterogeneity, isotropic damage, uniaxial compression

Procedia PDF Downloads 210
24267 Data Access, AI Intensity, and Scale Advantages

Authors: Chuping Lo

Abstract:

This paper presents a simple model demonstrating that ceteris paribus countries with lower barriers to accessing global data tend to earn higher incomes than other countries. Therefore, large countries that inherently have greater data resources tend to have higher incomes than smaller countries, such that the former may be more hesitant than the latter to liberalize cross-border data flows to maintain this advantage. Furthermore, countries with higher artificial intelligence (AI) intensity in production technologies tend to benefit more from economies of scale in data aggregation, leading to higher income and more trade as they are better able to utilize global data.

Keywords: digital intensity, digital divide, international trade, scale of economics

Procedia PDF Downloads 57
24266 Secured Transmission and Reserving Space in Images Before Encryption to Embed Data

Authors: G. R. Navaneesh, E. Nagarajan, C. H. Rajam Raju

Abstract:

Nowadays the multimedia data are used to store some secure information. All previous methods allocate a space in image for data embedding purpose after encryption. In this paper, we propose a novel method by reserving space in image with a boundary surrounded before encryption with a traditional RDH algorithm, which makes it easy for the data hider to reversibly embed data in the encrypted images. The proposed method can achieve real time performance, that is, data extraction and image recovery are free of any error. A secure transmission process is also discussed in this paper, which improves the efficiency by ten times compared to other processes as discussed.

Keywords: secure communication, reserving room before encryption, least significant bits, image encryption, reversible data hiding

Procedia PDF Downloads 408
24265 Identity Verification Using k-NN Classifiers and Autistic Genetic Data

Authors: Fuad M. Alkoot

Abstract:

DNA data have been used in forensics for decades. However, current research looks at using the DNA as a biometric identity verification modality. The goal is to improve the speed of identification. We aim at using gene data that was initially used for autism detection to find if and how accurate is this data for identification applications. Mainly our goal is to find if our data preprocessing technique yields data useful as a biometric identification tool. We experiment with using the nearest neighbor classifier to identify subjects. Results show that optimal classification rate is achieved when the test set is corrupted by normally distributed noise with zero mean and standard deviation of 1. The classification rate is close to optimal at higher noise standard deviation reaching 3. This shows that the data can be used for identity verification with high accuracy using a simple classifier such as the k-nearest neighbor (k-NN). 

Keywords: biometrics, genetic data, identity verification, k nearest neighbor

Procedia PDF Downloads 248
24264 A Review on Intelligent Systems for Geoscience

Authors: R Palson Kennedy, P.Kiran Sai

Abstract:

This article introduces machine learning (ML) researchers to the hurdles that geoscience problems present, as well as the opportunities for improvement in both ML and geosciences. This article presents a review from the data life cycle perspective to meet that need. Numerous facets of geosciences present unique difficulties for the study of intelligent systems. Geosciences data is notoriously difficult to analyze since it is frequently unpredictable, intermittent, sparse, multi-resolution, and multi-scale. The first half addresses data science’s essential concepts and theoretical underpinnings, while the second section contains key themes and sharing experiences from current publications focused on each stage of the data life cycle. Finally, themes such as open science, smart data, and team science are considered.

Keywords: Data science, intelligent system, machine learning, big data, data life cycle, recent development, geo science

Procedia PDF Downloads 129
24263 Geochemistry of Natural Radionuclides Associated with Acid Mine Drainage (AMD) in a Coal Mining Area in Southern Brazil

Authors: Juliana A. Galhardi, Daniel M. Bonotto

Abstract:

Coal is an important non-renewable energy source of and can be associated with radioactive elements. In Figueira city, Paraná state, Brazil, it was recorded high uranium activity near the coal mine that supplies a local thermoelectric power plant. In this context, the radon activity (Rn-222, produced by the Ra-226 decay in the U-238 natural series) was evaluated in groundwater, river water and effluents produced from the acid mine drainage in the coal reject dumps. The samples were collected in August 2013 and in February 2014 and analyzed at LABIDRO (Laboratory of Isotope and Hydrochemistry), UNESP, Rio Claro city, Brazil, using an alpha spectrometer (AlphaGuard) adjusted to evaluate the mean radon activity concentration in five cycles of 10 minutes. No radon activity concentration above 100 Bq.L-1, which was a previous critic value established by the World Health Organization. The average radon activity concentration in groundwater was higher than in surface water and in effluent samples, possibly due to the accumulation of uranium and radium in the aquifer layers that favors the radon trapping. The lower value in the river waters can indicate dilution and the intermediate value in the effluents may indicate radon absorption in the coal particles of the reject dumps. The results also indicate that the radon activities in the effluents increase with the sample acidification, possibly due to the higher radium leaching and the subsequent radon transport to the drainage flow. The water samples of Laranjinha River and Ribeirão das Pedras stream, which, respectively, supply Figueira city and receive the mining effluent, exhibited higher pH values upstream the mine, reflecting the acid mine drainage discharge. The radionuclides transport indicates the importance of monitoring their activity concentration in natural waters due to the risks that the radioactivity can represent to human health.

Keywords: radon, radium, acid mine drainage, coal

Procedia PDF Downloads 425
24262 Data Quality as a Pillar of Data-Driven Organizations: Exploring the Benefits of Data Mesh

Authors: Marc Bachelet, Abhijit Kumar Chatterjee, José Manuel Avila

Abstract:

Data quality is a key component of any data-driven organization. Without data quality, organizations cannot effectively make data-driven decisions, which often leads to poor business performance. Therefore, it is important for an organization to ensure that the data they use is of high quality. This is where the concept of data mesh comes in. Data mesh is an organizational and architectural decentralized approach to data management that can help organizations improve the quality of data. The concept of data mesh was first introduced in 2020. Its purpose is to decentralize data ownership, making it easier for domain experts to manage the data. This can help organizations improve data quality by reducing the reliance on centralized data teams and allowing domain experts to take charge of their data. This paper intends to discuss how a set of elements, including data mesh, are tools capable of increasing data quality. One of the key benefits of data mesh is improved metadata management. In a traditional data architecture, metadata management is typically centralized, which can lead to data silos and poor data quality. With data mesh, metadata is managed in a decentralized manner, ensuring accurate and up-to-date metadata, thereby improving data quality. Another benefit of data mesh is the clarification of roles and responsibilities. In a traditional data architecture, data teams are responsible for managing all aspects of data, which can lead to confusion and ambiguity in responsibilities. With data mesh, domain experts are responsible for managing their own data, which can help provide clarity in roles and responsibilities and improve data quality. Additionally, data mesh can also contribute to a new form of organization that is more agile and adaptable. By decentralizing data ownership, organizations can respond more quickly to changes in their business environment, which in turn can help improve overall performance by allowing better insights into business as an effect of better reports and visualization tools. Monitoring and analytics are also important aspects of data quality. With data mesh, monitoring, and analytics are decentralized, allowing domain experts to monitor and analyze their own data. This will help in identifying and addressing data quality problems in quick time, leading to improved data quality. Data culture is another major aspect of data quality. With data mesh, domain experts are encouraged to take ownership of their data, which can help create a data-driven culture within the organization. This can lead to improved data quality and better business outcomes. Finally, the paper explores the contribution of AI in the coming years. AI can help enhance data quality by automating many data-related tasks, like data cleaning and data validation. By integrating AI into data mesh, organizations can further enhance the quality of their data. The concepts mentioned above are illustrated by AEKIDEN experience feedback. AEKIDEN is an international data-driven consultancy that has successfully implemented a data mesh approach. By sharing their experience, AEKIDEN can help other organizations understand the benefits and challenges of implementing data mesh and improving data quality.

Keywords: data culture, data-driven organization, data mesh, data quality for business success

Procedia PDF Downloads 128
24261 Research on Internet Attention of Tourism and Marketing Strategy in Northeast Sichuan Economic Zone in China Based on Baidu Index

Authors: Chuanqiao Zheng, Wei Zeng, Haozhen Lin

Abstract:

As of March 2020, the number of Chinese netizens has reached 904 million. The proportion of Internet users accessing the Internet through mobile phones is as high as 99.3%. Under the background of 'Internet +', tourists have a stronger sense of independence in the choice of tourism destinations and tourism products. Tourists are more inclined to learn about the relevant information on tourism destinations and other tourists' evaluations of tourist products through the Internet. The search engine, as an integrated platform that contains a wealth of information, is highly valuable to the analysis of the characteristics of the Internet attention given to various tourism destinations, through big data mining and analysis. This article uses the Baidu Index as the data source, which is one of the products of Baidu Search. The Baidu Index is based on big data, which collects and shares the search results of a large number of Internet users on the Baidu search engine. The big data used in this article includes search index, demand map, population profile, etc. The main research methods used are: (1) based on the search index, analyzing the Internet attention given to the tourism in five cities in Northeast Sichuan at different times, so as to obtain the overall trend and individual characteristics of tourism development in the region; (2) based on the demand map and the population profile, analyzing the demographic characteristics and market positioning of the tourist groups in these cities to understand the characteristics and needs of the target groups; (3) correlating the Internet attention data with the permanent population of each province in China in the corresponding to construct the Boston matrix of the Internet attention rate of the Northeast Sichuan tourism, obtain the tourism target markets, and then propose development strategies for different markets. The study has found that: a) the Internet attention given to the tourism in the region can be categorized into tourist off-season and peak season; the Internet attention given to tourism in different cities is quite different. b) tourists look for information including tour guide information, ticket information, traffic information, weather information, and information on the competing tourism cities; with regard to the population profile, the main group of potential tourists searching for the keywords of tourism in the five prefecture-level cities in Northeast Sichuan are youth. The male to female ratio is about 6 to 4, with males being predominant. c) through the construction of the Boston matrix, it is concluded that the star market for tourism in the Northeast Sichuan Economic Zone includes Sichuan and Shaanxi; the cash cows market includes Hainan and Ningxia; the question market includes Jiangsu and Shanghai; the dog market includes Hubei and Jiangxi. The study concludes with the following planning strategies and recommendations: i) creating a diversified business format that integrates cultural and tourism; ii) creating a brand image of niche tourism; iii) focusing on the development of tourism products; iv) innovating composite three-dimensional marketing channels.

Keywords: Baidu Index, big data, internet attention, tourism

Procedia PDF Downloads 117
24260 Strategies for Conserving Ecosystem Functions of the Aravalli Range to Combat Land Degradation: Case of Kishangarh and Tijara Tehsil in Rajasthan, India

Authors: Saloni Khandelwal

Abstract:

The Aravalli hills are one of the oldest and most distinctive mountain chains of peninsular India spanning in around 692 Km. More than 60% of it falls in the state of Rajasthan and influences ecological equilibrium in about 30% of the state. Because of natural and human-induced activities, physical gaps in the Aravallis are increasing, new gaps are coming up, and its physical structure is changing. There are no strict regulations to protect and monitor the Aravallis and no comprehensive research and study has been done for the enhancement of ecosystem functions of these ranges. Through this study, various factors leading to Aravalli’s degradation are identified and its impacts on selected areas are analyzed. A literature study is done to identify factors responsible for the degradation. To understand the severity of the problem at the lowest level, two tehsils from different districts in Rajasthan, which are the most affected due to illegal mining and increasing physical gaps are selected for the study. Case-1 of three-gram panchayats in Kishangarh Tehsil of Ajmer district focuses on the expanding physical gaps in the Aravalli range, and case-2 of three-gram panchayats in Tijara Tehsil of Alwar district focuses on increasing illegal mining in the Aravalli range. For measuring the degradation, physical, biological and social indicators are identified through literature review and for both the cases analysis is done on the basis of these indicators. Primary survey and focus group discussions are done with villagers, mining owners, illegal miners, and various government officials to understand dependency of people on the Aravalli and its importance to them along with the impact of degradation on their livelihood and environment. From the analysis, it has been found that green cover is continuously decreasing in both cases, dense forest areas do not exist now, the groundwater table is depleting at a very fast rate, soil is losing its moisture resulting in low yield and shift in agriculture. Wild animals which were easily seen earlier are now extinct. Cattles of villagers are dependent on the forest area in the Aravalli range for food, but with a decrease in fodder, their cattle numbers are decreasing. There is a decrease in agricultural land and an increase in scrub and salt-affected land. Analysis of various national and state programmes, acts which were passed to conserve biodiversity has been done showing that none of them is helping much to protect the Aravalli. For conserving the Aravalli and its forest areas, regional level and local level initiatives are required and are proposed in this study. This study is an attempt to formulate conservation and management strategies for the Aravalli range. These strategies will help in improving biodiversity which can lead to the revival of its ecosystem functions. It will also help in curbing the pollution at the regional and local level. All this will lead to the sustainable development of the region.

Keywords: Aravalli, ecosystem, LULC, Rajasthan

Procedia PDF Downloads 130
24259 Significant Aspects and Drivers of Germany and Australia's Energy Policy from a Political Economy Perspective

Authors: Sarah Niklas, Lynne Chester, Mark Diesendorf

Abstract:

Geopolitical tensions, climate change and recent movements favouring a transformative shift in institutional power structures have influenced the economics of conventional energy supply for decades. This study takes a multi-dimensional approach to illustrate the potential of renewable energy (RE) technology to provide a pathway to a low-carbon economy driven by ecologically sustainable, independent and socially just energy. This comparative analysis identifies economic, political and social drivers that shaped the adoption of RE policy in two significantly different economies, Germany and Australia, with strong and weak commitments to RE respectively. Two complementary political-economy theories frame the document-based analysis. Régulation Theory, inspired by Marxist ideas and strongly influenced by contemporary economic problems, provides the background to explore the social relationships contributing the adoption of RE within the macro-economy. Varieties of Capitalism theory, a more recently developed micro-economic approach, examines the nature of state-firm relationships. Together these approaches provide a comprehensive lens of analysis. Germany’s energy policy transformed substantially over the second half of the last century. The development is characterised by the coordination of societal, environmental and industrial demands throughout the advancement of capitalist regimes. In the Fordist regime, mass production based on coal drove Germany’s astounding economic recovery during the post-war period. Economic depression and the instability of institutional arrangements necessitated the impulsive seeking of national security and energy independence. During the postwar Flexi-Fordist period, quality-based production, innovation and technology-based competition schemes, particularly with regard to political power structures in and across Europe, favoured the adoption of RE. Innovation, knowledge and education were institutionalized, leading to the legislation of environmental concerns. Lastly the establishment of government-industry-based coordinative programs supported the phase out of nuclear power and the increased adoption of RE during the last decade. Australia’s energy policy is shaped by the country’s richness in mineral resources. Energy policy largely served coal mining, historically and currently one of the most capital-intense industry. Assisted by the macro-economic dimensions of institutional arrangements, social and financial capital is orientated towards the export-led and strongly demand-oriented economy. Here energy policy serves the maintenance of capital accumulation in the mining sector and the emerging Asian economies. The adoption of supportive renewable energy policy would challenge the distinct role of the mining industry within the (neo)-liberal market economy. The state’s protective role of the mining sector has resulted in weak commitment to RE policy and investment uncertainty in the energy sector. Recent developments, driven by strong public support for RE, emphasize the sense of community in urban and rural areas and the emergence of a bottom-up approach to adopt renewables. Thus, political economy frameworks on both the macro-economic (Regulation Theory) and micro-economic (Varieties of Capitalism theory) scales can together explain the strong commitment to RE in Germany vis-à-vis the weak commitment in Australia.

Keywords: political economy, regulation theory, renewable energy, social relationships, energy transitions

Procedia PDF Downloads 376
24258 Big Data Analysis with RHadoop

Authors: Ji Eun Shin, Byung Ho Jung, Dong Hoon Lim

Abstract:

It is almost impossible to store or analyze big data increasing exponentially with traditional technologies. Hadoop is a new technology to make that possible. R programming language is by far the most popular statistical tool for big data analysis based on distributed processing with Hadoop technology. With RHadoop that integrates R and Hadoop environment, we implemented parallel multiple regression analysis with different sizes of actual data. Experimental results showed our RHadoop system was much faster as the number of data nodes increases. We also compared the performance of our RHadoop with lm function and big lm packages available on big memory. The results showed that our RHadoop was faster than other packages owing to paralleling processing with increasing the number of map tasks as the size of data increases.

Keywords: big data, Hadoop, parallel regression analysis, R, RHadoop

Procedia PDF Downloads 429
24257 A Mutually Exclusive Task Generation Method Based on Data Augmentation

Authors: Haojie Wang, Xun Li, Rui Yin

Abstract:

In order to solve the memorization overfitting in the meta-learning MAML algorithm, a method of generating mutually exclusive tasks based on data augmentation is proposed. This method generates a mutex task by corresponding one feature of the data to multiple labels, so that the generated mutex task is inconsistent with the data distribution in the initial dataset. Because generating mutex tasks for all data will produce a large number of invalid data and, in the worst case, lead to exponential growth of computation, this paper also proposes a key data extraction method, that only extracts part of the data to generate the mutex task. The experiments show that the method of generating mutually exclusive tasks can effectively solve the memorization overfitting in the meta-learning MAML algorithm.

Keywords: data augmentation, mutex task generation, meta-learning, text classification.

Procedia PDF Downloads 89
24256 Efficient Positioning of Data Aggregation Point for Wireless Sensor Network

Authors: Sifat Rahman Ahona, Rifat Tasnim, Naima Hassan

Abstract:

Data aggregation is a helpful technique for reducing the data communication overhead in wireless sensor network. One of the important tasks of data aggregation is positioning of the aggregator points. There are a lot of works done on data aggregation. But, efficient positioning of the aggregators points is not focused so much. In this paper, authors are focusing on the positioning or the placement of the aggregation points in wireless sensor network. Authors proposed an algorithm to select the aggregators positions for a scenario where aggregator nodes are more powerful than sensor nodes.

Keywords: aggregation point, data communication, data aggregation, wireless sensor network

Procedia PDF Downloads 149
24255 Spatial Econometric Approaches for Count Data: An Overview and New Directions

Authors: Paula Simões, Isabel Natário

Abstract:

This paper reviews a number of theoretical aspects for implementing an explicit spatial perspective in econometrics for modelling non-continuous data, in general, and count data, in particular. It provides an overview of the several spatial econometric approaches that are available to model data that are collected with reference to location in space, from the classical spatial econometrics approaches to the recent developments on spatial econometrics to model count data, in a Bayesian hierarchical setting. Considerable attention is paid to the inferential framework, necessary for structural consistent spatial econometric count models, incorporating spatial lag autocorrelation, to the corresponding estimation and testing procedures for different assumptions, to the constrains and implications embedded in the various specifications in the literature. This review combines insights from the classical spatial econometrics literature as well as from hierarchical modeling and analysis of spatial data, in order to look for new possible directions on the processing of count data, in a spatial hierarchical Bayesian econometric context.

Keywords: spatial data analysis, spatial econometrics, Bayesian hierarchical models, count data

Procedia PDF Downloads 586
24254 A NoSQL Based Approach for Real-Time Managing of Robotics's Data

Authors: Gueidi Afef, Gharsellaoui Hamza, Ben Ahmed Samir

Abstract:

This paper deals with the secret of the continual progression data that new data management solutions have been emerged: The NoSQL databases. They crossed several areas like personalization, profile management, big data in real-time, content management, catalog, view of customers, mobile applications, internet of things, digital communication and fraud detection. Nowadays, these database management systems are increasing. These systems store data very well and with the trend of big data, a new challenge’s store demands new structures and methods for managing enterprise data. The new intelligent machine in the e-learning sector, thrives on more data, so smart machines can learn more and faster. The robotics are our use case to focus on our test. The implementation of NoSQL for Robotics wrestle all the data they acquire into usable form because with the ordinary type of robotics; we are facing very big limits to manage and find the exact information in real-time. Our original proposed approach was demonstrated by experimental studies and running example used as a use case.

Keywords: NoSQL databases, database management systems, robotics, big data

Procedia PDF Downloads 346
24253 Effects of pH, Load Capacity and Contact Time in the Sulphate Sorption onto a Functionalized Mesoporous Structure

Authors: Jaime Pizarro, Ximena Castillo

Abstract:

The intensive use of water in agriculture, industry, human consumption and increasing pollution are factors that reduce the availability of water for future generations; the challenge is to advance in sustainable and low-cost solutions to reuse water and to facilitate the availability of the resource in quality and quantity. The use of new low-cost materials with sorbent capacity for pollutants is a solution that contributes to the improvement and expansion of water treatment and reuse systems. Fly ash, a residue from the combustion of coal in power plants that is produced in large quantities in newly industrialized countries, contains a high amount of silicon oxides and aluminum oxides, whose properties can be used for the synthesis of mesoporous materials. Properly functionalized, this material allows obtaining matrixes with high sorption capacity. The mesoporous materials have a large surface area, thermal and mechanical stability, uniform porous structure, and high sorption and functionalization capacities. The goal of this study was to develop hexagonal mesoporous siliceous material (HMS) for the adsorption of sulphate from industrial and mining waters. The silica was extracted from fly ash after calcination at 850 ° C, followed by the addition of water. The mesoporous structure has a surface area of 282 m2 g-1 and a size of 5.7 nm and was functionalized with ethylene diamine through of a self-assembly method. The material was characterized by Diffuse Reflectance Infrared Fourier Transform Spectroscopy (DRIFTS). The capacity of sulphate sorption was evaluated according to pH, maximum load capacity and contact time. The sulphate maximum adsorption capacity was 146.1 mg g-1, which is three times higher than commercial sorbents. The kinetic data were fitted according to a pseudo-second order model with a high coefficient of linear regression at different initial concentrations. The adsorption isotherm that best fitted the experimental data was the Freundlich model.

Keywords: fly ash, mesoporous siliceous, sorption, sulphate

Procedia PDF Downloads 150
24252 A Mutually Exclusive Task Generation Method Based on Data Augmentation

Authors: Haojie Wang, Xun Li, Rui Yin

Abstract:

In order to solve the memorization overfitting in the model-agnostic meta-learning MAML algorithm, a method of generating mutually exclusive tasks based on data augmentation is proposed. This method generates a mutex task by corresponding one feature of the data to multiple labels so that the generated mutex task is inconsistent with the data distribution in the initial dataset. Because generating mutex tasks for all data will produce a large number of invalid data and, in the worst case, lead to an exponential growth of computation, this paper also proposes a key data extraction method that only extract part of the data to generate the mutex task. The experiments show that the method of generating mutually exclusive tasks can effectively solve the memorization overfitting in the meta-learning MAML algorithm.

Keywords: mutex task generation, data augmentation, meta-learning, text classification.

Procedia PDF Downloads 133