Search results for: heterogeneous massive data
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 25192

24592 Data Mining Approach for Commercial Data Classification and Migration in Hybrid Storage Systems

Authors: Mais Haj Qasem, Maen M. Al Assaf, Ali Rodan

Abstract:

Parallel hybrid storage systems consist of a hierarchy of storage devices that differ in data read speed. Read speed increases as we ascend the hierarchy. Thus, migrating the application's important data that will be accessed in the near future to the uppermost level reduces the application's I/O waiting time and, hence, its elapsed execution time. In this research, we implement a trace-driven, two-level parallel hybrid storage system prototype that consists of HDDs and SSDs. The prototype uses data mining techniques to classify the application's data in order to determine its near-future data accesses in parallel with its on-demand requests. The important data (i.e., the data that the application will access in the near future) are continuously migrated to the uppermost level of the hierarchy. Our simulation results show that our data migration approach, integrated with data mining techniques, reduces the application's elapsed execution time by at least 22% across a variety of traces.

Keywords: hybrid storage system, data mining, recurrent neural network, support vector machine

Procedia PDF Downloads 289
24591 Discussion on Big Data and One of Its Early Training Application

Authors: Fulya Gokalp Yavuz, Mark Daniel Ward

Abstract:

This study focuses on a contemporary and inevitable topic of Data Science and its exemplary application for early career building: Big Data and the Living Learning Community (LLC). ‘Academia’ and ‘Industry’ agree on the importance of Big Data. However, both are at risk of missing out on training in this interdisciplinary area. Some traditional teaching doctrines are far from effective for Data Science. Practitioners need some intuition and real-life examples of how to apply new methods to data on the scale of terabytes. We briefly explain the scope of Data Science training and exemplify its early-stage application with the LLC, a National Science Foundation (NSF) funded project under the supervision of Prof. Ward since 2014. Essentially, we aim to give professors, researchers and practitioners some intuition for combining data science tools in comprehensive real-life examples, guided by mentees’ feedback. By discussing mentoring methods and the computational challenges of Big Data, we intend to underline its potential in more concrete terms.

Keywords: Big Data, computation, mentoring, training

Procedia PDF Downloads 341
24590 Towards a Secure Storage in Cloud Computing

Authors: Mohamed Elkholy, Ahmed Elfatatry

Abstract:

Cloud computing has emerged as a flexible computing paradigm that reshaped the Information Technology map. However, cloud computing brought about a number of security challenges as a result of the physical distribution of computational resources and the limited control that users have over the physical storage. This situation raises many security challenges for data integrity and confidentiality as well as authentication and access control. This work proposes a security mechanism for data integrity that allows a data owner to be aware of any modification that takes place to his data. The data integrity mechanism is integrated with an extended Kerberos authentication that ensures authorized access control. The proposed mechanism protects data confidentiality even if data are stored on an untrusted storage. The proposed mechanism has been evaluated against different types of attacks and proved its efficiency to protect cloud data storage from different malicious attacks.

Keywords: access control, data integrity, data confidentiality, Kerberos authentication, cloud security

Procedia PDF Downloads 313
24589 Analysis of Tactile Perception of Textiles by Fingertip Skin Model

Authors: Izabela L. Ciesielska-Wróbel

Abstract:

This paper presents finite element models of the fingertip skin which have been created to simulate the contact of textile objects with the skin to gain a better understanding of the perception of textiles through the skin, so-called Hand of Textiles (HoT). Many objective and subjective techniques have been developed to analyze HoT; however, none of them provides exact overall information concerning the sensation of textiles through the skin. As the human skin is a complex heterogeneous hyperelastic body composed of many particles, some simplifications had to be made at the stage of building the models. The same applies to the models of woven structures; however, their utilitarian value was maintained. The models reflect only friction between skin and woven textiles, deformation of the skin and fabrics when “touching” textiles, and heat transfer from the surface of the skin in the direction of the textiles.

Keywords: fingertip skin models, finite element models, modelling of textiles, sensation of textiles through the skin

Procedia PDF Downloads 450
24588 Ontological Modeling Approach for Statistical Databases Publication in Linked Open Data

Authors: Bourama Mane, Ibrahima Fall, Mamadou Samba Camara, Alassane Bah

Abstract:

At the level of the National Statistical Institutes, there is a large volume of data, generally in a format that dictates how the information it contains is published. Each household or business data collection project includes a dissemination platform for its implementation. As a result, the dissemination methods used so far do not promote rapid access to information and, in particular, do not offer the option of linking data for in-depth processing. In this paper, we present an approach to modeling these data to publish them in a format intended for the Semantic Web. Our objective is to be able to publish all this data on a single platform and offer the option to link it with other external data sources. The approach will be applied to data from major national surveys, such as those on employment, poverty and child labor, and the general census of the population of Senegal.

Keywords: Semantic Web, linked open data, database, statistic

Procedia PDF Downloads 162
24587 Improved Mechanical and Electrical Properties and Thermal Stability of Post-Consumer Polyethylene Terephthalate Glycol Containing Hybrid System of Nanofillers

Authors: Iman Taraghi, Sandra Paszkiewicz, Daria Pawlikowska, Anna Szymczyk, Izabela Irska, Rafal Stanik, Amelia Linares, Tiberio A. Ezquerra, Elżbieta Piesowicz

Abstract:

Currently, the massive use of thermoplastic materials in industrial applications generates huge amounts of polymer waste. Poly(ethylene glycol-co-1,4-cyclohexanedimethanol terephthalate) (PET-G) has been widely used in food packaging and polymer foils. In this research, PET-G foils have been recycled and reused as a matrix to combine with different types of nanofillers such as carbon nanotubes, graphene nanoplatelets, and nanosized carbon black. The mechanical and electrical properties, as well as the thermal stability and thermal conductivity of the PET-G, improved with the addition of the aforementioned nanofillers and their hybrid systems.

Keywords: polymer hybrid nanocomposites, carbon nanofillers, recycling, physical performance

Procedia PDF Downloads 101
24586 The Role of Data Protection Officer in Managing Individual Data: Issues and Challenges

Authors: Nazura Abdul Manap, Siti Nur Farah Atiqah Salleh

Abstract:

For decades, the misuse of personal data has been a critical issue. Malaysia has accepted responsibility by implementing the Malaysian Personal Data Protection Act 2010 (PDPA 2010) to secure personal data. After more than a decade, this legislation is set to be revised by the current PDPA 2023 Amendment Bill to align with the world's key personal data protection regulations, such as the European Union General Data Protection Regulation (GDPR). Among the other suggested adjustments is the Data User's appointment of a Data Protection Officer (DPO) to ensure the commercial entity's compliance with the PDPA 2010 criteria. The change is expected to be enacted in parliament fairly soon; nevertheless, based on the experience of the Personal Data Protection Department (PDPD) in implementing the Act, it is projected that there will be a slew of additional concerns associated with the DPO mandate. Consequently, the goal of this article is to highlight the issues that the DPO will encounter and how the Personal Data Protection Department should respond to them. The study result was produced using a qualitative technique based on an examination of the current literature. This research reveals that the DPO is likely to face obstacles, and thus a definite, clear guideline should be in place to aid DPOs in executing their tasks. It is argued that appointing a DPO is a wise measure in ensuring that the legal data security requirements are met.

Keywords: guideline, law, data protection officer, personal data

Procedia PDF Downloads 60
24585 Ag Loaded WO3 Nanoplates for Photocatalytic Degradation of Sulfanilamide and Bacterial Removal under Visible Light

Authors: W. Y. Zhu, X. L. Yan, Y. Zhou

Abstract:

Sulfonamides (SAs) are extensively used antibiotics; photocatalysis is an effective way to remove SAs from water using solar energy. Here we used WO3 nanoplates and their Ag-loaded heterogeneous composites as photocatalysts to investigate their photodegradation efficiency against sulfanilamide (SAM), which is the precursor of SAs. Results showed that WO3/Ag composites performed much better than pure WO3, with a highest removal rate of 96.2% achieved under visible light irradiation. Ag, as an excellent antibacterial agent, also endows WO3 with a certain antibacterial efficiency, and 100% removal efficiency could be achieved in 2 h under visible light irradiation for all WO3/Ag composites. Overall, WO3/Ag composites are very effective photocatalysts with potential in practical applications that mainly use cheap, clean and green solar energy as the energy source.

Keywords: antibacterial, photocatalysis, semiconductor, sulfanilamide

Procedia PDF Downloads 340
24584 Airport Investment Risk Assessment under Uncertainty

Authors: Elena M. Capitanul, Carlos A. Nunes Cosenza, Walid El Moudani, Felix Mora Camino

Abstract:

The construction of a new airport or the extension of an existing one requires massive investments, and public-private partnerships have often been considered in order to make such projects feasible. One characteristic of these projects is uncertainty with respect to financial and environmental impacts in the medium to long term. Another is the multistage nature of these types of projects. While many airport development projects have been a success, some others have turned into a nightmare for their promoters. This communication puts forward a new approach for airport investment risk assessment. The approach explicitly takes into account the degree of uncertainty in the prediction of activity levels and proposes milestones for the different stages of the project in order to minimize risk. Uncertainty is represented through fuzzy dual theory and risk management is performed using dynamic programming. An illustration of the proposed approach is provided.

Keywords: airports, fuzzy logic, risk, uncertainty

Procedia PDF Downloads 386
24583 Data Collection Based on the Questionnaire Survey In-Hospital Emergencies

Authors: Nouha Mhimdi, Wahiba Ben Abdessalem Karaa, Henda Ben Ghezala

Abstract:

The methods identified in data collection are diverse: electronic media, focus group interviews and short-answer questionnaires [1]. The collection of poor-quality data resulting, for example, from poorly designed questionnaires, the absence of good translators or interpreters, and the incorrect recording of data leads to conclusions that are not supported by the data or that focus only on the average effect of the program or policy. There are several solutions to avoid or minimize the most frequent errors, including obtaining expert advice on the design or adaptation of data collection instruments, or using technologies that allow better "anonymity" in the responses [2]. In this context, we opted to collect good-quality data by conducting a sizeable questionnaire-based survey on hospital emergencies to improve emergency services and alleviate the problems encountered. In this paper, we present our study and detail the steps followed to collect relevant, consistent and practical data.

Keywords: data collection, survey, questionnaire, database, data analysis, hospital emergencies

Procedia PDF Downloads 88
24582 Habits: Theoretical Foundations and a Conceptual Framework on a Managerial Trap and Chance

Authors: K. Piórkowska

Abstract:

The overarching aim of the paper is to incorporate the micro-foundations perspective into strategic management and offer possibilities to bridge the macro–micro divide, to review the concept of habits, and to propose research findings and directions for further exploring the habit construct and its impact on higher epistemological level phenomena (for instance, organizational routines, a domain that is inherently multilevel in nature). To realize this aim, the following sections have been developed: (1) habits’ origins, (2) habits – cognitive constellations, (3) interrelationships between habits and mental representations, intentions, (4) habits and organizational routines, and (5) habits and routines linkages with adaptation. The conclusions support recent and current studies linking the level of individual heterogeneous agents with the level of macro (organizational) outcomes.

Keywords: behaviorism, habits, micro-foundations, routines

Procedia PDF Downloads 241
24581 Federated Learning in Healthcare

Authors: Ananya Gangavarapu

Abstract:

Models based on Convolutional Neural Networks (CNN) provide diagnostic capabilities on par with medical specialists in many specialty areas. However, collecting medical data for training purposes is very challenging because of increased regulation of data collection and privacy concerns around personal health data. Gathering the data becomes even more difficult if the capture devices are edge-based mobile devices (like smartphones) with feeble wireless connectivity in rural/remote areas. In this paper, I would like to highlight the Federated Learning approach to mitigating data privacy and security issues.
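As an illustration only, the aggregation step commonly associated with federated learning (federated averaging) can be sketched as below. The abstract does not say which aggregation rule the paper uses, so the function name `federated_average`, the flat list-of-floats parameter layout, and the weighting by local sample count are all assumptions made for this sketch.

```python
from typing import List, Sequence


def federated_average(client_weights: Sequence[Sequence[float]],
                      client_sizes: Sequence[int]) -> List[float]:
    """Combine client parameter vectors into one global vector.

    Each client trains locally and shares only its parameters, never its
    raw data. The server averages the parameters, weighting every client
    by the number of local samples it trained on (hypothetical layout:
    one flat list of floats per client).
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    dim = len(client_weights[0])
    averaged = [0.0] * dim
    for weights, size in zip(client_weights, client_sizes):
        if len(weights) != dim:
            raise ValueError("all clients must share the same parameter shape")
        for i, w in enumerate(weights):
            averaged[i] += w * size / total
    return averaged


# Two hypothetical clients: one with 1 sample, one with 3 samples.
global_weights = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

In this sketch the client with more local data pulls the global parameters closer to its own; the raw health records stay on the devices.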

Keywords: deep learning in healthcare, data privacy, federated learning, training in distributed environment

Procedia PDF Downloads 121
24580 Calculation of Pressure-Varying Langmuir and Brunauer-Emmett-Teller Isotherm Adsorption Parameters

Authors: Trevor C. Brown, David J. Miron

Abstract:

Gas-solid physical adsorption methods are central to the characterization and optimization of the effective surface area, pore size and porosity for applications such as heterogeneous catalysis, and gas separation and storage. Properties such as adsorption uptake, capacity, equilibrium constants and Gibbs free energy are dependent on the composition and structure of both the gas and the adsorbent. However, challenges remain in accurately calculating these properties from experimental data. Gas adsorption experiments involve measuring the amounts of gas adsorbed over a range of pressures under isothermal conditions. Various constant-parameter models, such as the Langmuir and Brunauer-Emmett-Teller (BET) theories, are used to provide information on adsorbate and adsorbent properties from the isotherm data. These models typically do not provide accurate interpretations across the full range of pressures and temperatures. The Langmuir adsorption isotherm is a simple approximation for modelling equilibrium adsorption data and has been effective in estimating surface areas and catalytic rate laws, particularly for high surface area solids. The Langmuir isotherm assumes the systematic filling of identical adsorption sites to a monolayer coverage. The BET model is based on the Langmuir isotherm and allows for the formation of multiple layers. These additional layers do not interact with the first layer and their energetics are equal to those of the adsorbate as a bulk liquid. This BET method is widely used to measure the specific surface area of materials. Both Langmuir and BET models assume that the affinity of the gas for all adsorption sites is identical and so the calculated adsorbent uptake at the monolayer and equilibrium constant are independent of coverage and pressure. Accurate representations of adsorption data have been achieved by extending the Langmuir and BET models to include pressure-varying uptake capacities and equilibrium constants. 
These parameters are determined using a novel regression technique called flexible least squares for time-varying linear regression. For isothermal adsorption the adsorption parameters are assumed to vary slowly and smoothly with increasing pressure. The flexible least squares for pressure-varying linear regression (FLS-PVLR) approach assumes two distinct types of discrepancy terms, dynamic and measurement for all parameters in the linear equation used to simulate the data. Dynamic terms account for pressure variation in successive parameter vectors, and measurement terms account for differences between observed and theoretically predicted outcomes via linear regression. The resultant pressure-varying parameters are optimized by minimizing both dynamic and measurement residual squared errors. Validation of this methodology has been achieved by simulating adsorption data for n-butane and isobutane on activated carbon at 298 K, 323 K and 348 K and for nitrogen on mesoporous alumina at 77 K with pressure-varying Langmuir and BET adsorption parameters (equilibrium constants and uptake capacities). This modeling provides information on the adsorbent (accessible surface area and micropore volume), adsorbate (molecular areas and volumes) and thermodynamic (Gibbs free energies) variations of the adsorption sites.
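For orientation, the constant-parameter Langmuir baseline that the abstract extends can be sketched as follows. This is not the authors' FLS-PVLR method: it is the textbook single-site Langmuir form q = q_max·K·P / (1 + K·P), fitted by ordinary least squares on the linearized relation P/q = 1/(q_max·K) + P/q_max. The function names and the synthetic data are assumptions made for this example.

```python
from typing import Sequence, Tuple


def langmuir_uptake(pressure: float, q_max: float, k_eq: float) -> float:
    """Constant-parameter Langmuir uptake at a given pressure."""
    return q_max * k_eq * pressure / (1.0 + k_eq * pressure)


def fit_langmuir(pressures: Sequence[float],
                 uptakes: Sequence[float]) -> Tuple[float, float]:
    """Estimate (q_max, k_eq) from isotherm data.

    Uses the linearized form P/q = intercept + slope * P, where
    slope = 1/q_max and intercept = 1/(q_max * k_eq), and solves it
    with ordinary least squares. Both parameters are treated as
    constants over the whole pressure range.
    """
    xs = list(pressures)
    ys = [p / q for p, q in zip(pressures, uptakes)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return 1.0 / slope, slope / intercept


# Synthetic, noise-free data generated from assumed parameters.
pressures = [0.5, 1.0, 2.0, 4.0, 8.0]
uptakes = [langmuir_uptake(p, q_max=2.0, k_eq=0.5) for p in pressures]
q_max_hat, k_eq_hat = fit_langmuir(pressures, uptakes)
```

A single (q_max, K) pair for the whole isotherm is exactly the constraint that the pressure-varying approach in the abstract relaxes.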

Keywords: Langmuir adsorption isotherm, BET adsorption isotherm, pressure-varying adsorption parameters, adsorbate and adsorbent properties and energetics

Procedia PDF Downloads 209
24579 Metal Contaminants in River Water and Human Urine after an Episode of Major Pollution by Mining Wastes in the Kasai Province of DR Congo

Authors: Remy Mpulumba Badiambile, Paul Musa Obadia, Malick Useni Mutayo, Jeef Numbi Mukanya, Patient Nkulu Banza, Tony Kayembe Kitenge, Erik Smolders, Jean-François Picron, Vincent Haufroid, Célestin Banza Lubaba Nkulu, Benoit Nemery

Abstract:

Background: In July 2021, the Tshikapa river became heavily polluted by mining wastes from a diamond mine in neighboring Angola, leading to a massive fish kill, as well as disease and even deaths among residents living along the Tshikapa and Kasai rivers, a major tributary of the Congo river. The exact nature of the pollutants was unknown. Methods: In a cross-sectional study conducted in the city of Tshikapa in August 2021, we enrolled by opportunistic sampling 65 residents (11 children < 16y) living alongside the polluted rivers and 65 control residents (5 children) living alongside a non-affected portion of the Kasai river (upstream from the Tshikapa-Kasai confluence). We administered a questionnaire and obtained spot urine samples for measurements of thiocyanate (a metabolite of cyanide) and 26 trace metals (by ICP-MS). Metals (and pH) were also measured in samples of river water. Results: Participants from both groups consumed river water. In the area affected by the pollution, most participants had eaten dead fish. Prevalences of reported health symptoms were higher in the exposed group than among controls: skin rashes (52% vs 0%), diarrhea (40% vs 8%), abdominal pain (8% vs 3%), nausea (3% vs 0%). In polluted water, concentrations [median (range)] were higher than in non-polluted water only for nickel [2.2(1.4–3.5)µg/L vs 0.8(0.6–1.9)µg/L] and uranium [78(71–91)ng/L vs 9(7–19)ng/L]. In urine, concentrations [µg/g creatinine, median(IQR)] were significantly higher in the exposed group than in controls for lithium [19.5(12.4–27.3) vs 6.9(5.9–12.1)], thallium [0.41(0.31–0.57) vs 0.19(0.16–0.39)], and uranium [0.026(0.013–0.037) vs 0.012(0.006–0.024)]. Other elements did not differ between the groups, but levels were higher than reference values for several metals (including manganese, cobalt, nickel, and lead). Urinary thiocyanate concentrations did not differ. 
Conclusion: This study, after an ecological disaster in the DRC, has documented contamination of river water by nickel and uranium and high urinary levels of some trace metals among affected riverine populations. However, the exact cause of the massive fish kill and disease among residents remains elusive. The capacity to rapidly investigate toxic pollution events must be increased in the area.

Keywords: metal contaminants, river water and human urine, pollution by mining wastes, DR Congo

Procedia PDF Downloads 125
24578 The Utilization of Big Data in Knowledge Management Creation

Authors: Daniel Brian Thompson, Subarmaniam Kannan

Abstract:

The volume of knowledge in the world and within organizational repositories has already reached an immense scale and keeps increasing over time. To cope with this growth, Big Data implementations and algorithms are used to obtain new or enhanced knowledge for decision-making. The transition from data to knowledge brings transformational changes that provide tangible benefits to those implementing these practices. Today, various organizations derive knowledge from observations and intuitions, and this information or data is translated into best practices for knowledge acquisition, generation and sharing. The main intention behind the widespread usage of Big Data is to provide information that has been cleaned and analyzed, yielding tangible insights that an organization can apply to its knowledge-creation practices based on facts and figures. The translation of data into knowledge generates value for an organization, enabling it to make firm decisions on adopting best practices. Without a strong foundation of knowledge and Big Data, businesses are not able to grow and improve within the competitive environment.

Keywords: big data, knowledge management, data driven, knowledge creation

Procedia PDF Downloads 91
24577 Congruency of English Teachers’ Assessments Vis-à-Vis 21st Century Skills Assessment Standards

Authors: Mary Jane Suarez

Abstract:

A massive educational overhaul has taken place at the onset of the 21st century, addressing the mismatch between employability skills and the scholastic skills taught in schools. For a community to thrive in an ever-developing economy, every educational institution should teach the skills necessary for job competencies. However, in fostering 21st-century skills amongst learners, teachers, who often lack familiarity with and thorough insight into the emerging 21st-century skills, are constrained by the need to understand the characteristics of 21st-century skills learning and to implement the tenets of 21st-century skills teaching. In an endeavor to espouse 21st-century skills learning and teaching, a United States-based national coalition called the Partnership for 21st Century Skills (P21) has identified the four most important skills in 21st-century learning: critical thinking, communication, collaboration, and creativity and innovation, with an established framework for 21st-century skills standards. Assessment of skills is the lifeblood of every teaching and learning encounter. It is correspondingly crucial to look at the 21st-century standards and the assessment guides recognized by P21 to ensure that learners are 21st-century ready. This mixed-method study sought to discover and describe what classroom assessments were used by English teachers in a public secondary school in the Philippines with course offerings on science, technology, engineering, and mathematics (STEM). The research evaluated the assessment tools implemented by English teachers and how congruent these assessment tools were with the 21st-century assessment standards of P21. A convergent parallel design was used to analyze assessment tools and practices in four phases. 
In the data-gathering phase, survey questionnaires, document reviews, interviews, and classroom observations were used to gather quantitative and qualitative data simultaneously and to establish how consistent assessment tools and practices were with the P21 framework, with the four Cs as its foci. In the analysis phase, the data were treated using mean, frequency, and percentage. In the merging and interpretation phases, a side-by-side comparison was used to identify convergent and divergent aspects of the results. In conclusion, the results showed that the assessment tools and practices were used inconsistently, if at all, by teachers. Findings showed that there were inconsistencies in implementing authentic assessments, that rubrics were rarely used to critically assess 21st-century skills in both language and literature subjects, that there were incongruencies in using portfolio and self-reflective assessments, that intercultural aspects were excluded from the assessment of the four Cs, and that collaboration was not integrated into formative and summative assessments. As a recommendation, a harmonized assessment scheme of P21 skills was devised for teachers to plan, implement, and monitor classroom assessments of 21st-century skills, ensuring the alignment of such assessments with P21 standards in furtherance of the institution’s thrust to effectively integrate 21st-century skills assessment standards into its curricula.

Keywords: 21st-century skills, 21st-century skills assessments, assessment standards, congruency, four Cs

Procedia PDF Downloads 170
24576 Survey on Data Security Issues Through Cloud Computing Amongst SMEs in Nairobi County, Kenya

Authors: Masese Chuma Benard, Martin Onsiro Ronald

Abstract:

Businesses have increasingly adopted cloud computing to benefit from its advantages. However, employing cloud computing also introduces new security concerns, particularly with regard to data security, potential risks and weaknesses that could be exploited by attackers, and the various tactics and strategies that could be used to lessen these risks. This study examines data security issues in cloud computing amongst SMEs in Nairobi County, Kenya. The study used a sample size of 48 and a mixed-methods research approach. The findings show that the data owner has no control over the cloud merchant's data management procedures, so there is no way to ensure that data is handled legally. This implies that the owner loses control over the data stored in the cloud. Data and information stored in the cloud may face a range of availability issues due to internet outages; this can represent a significant risk to data kept in shared clouds. Integrity, availability, and secrecy are all mentioned.

Keywords: data security, cloud computing, information, information security, small and medium-sized firms (SMEs)

Procedia PDF Downloads 64
24575 Cloud Design for Storing Large Amount of Data

Authors: M. Strémy, P. Závacký, P. Cuninka, M. Juhás

Abstract:

The main goal of this paper is to introduce our design of a private cloud for storing large amounts of data, especially pictures, and to provide a good technological backend for data analysis based on parallel processing and business intelligence. We have tested hypervisors, cloud management tools, storage for holding all the data, and Hadoop for analyzing unstructured data. High availability, virtual network management, logical separation of projects and rapid deployment of physical servers to our environment were also required.

Keywords: cloud, glusterfs, hadoop, juju, kvm, maas, openstack, virtualization

Procedia PDF Downloads 338
24574 Estimation of Missing Values in Aggregate Level Spatial Data

Authors: Amitha Puranik, V. S. Binu, Seena Biju

Abstract:

Missing data is a common problem in spatial analysis, especially at the aggregate level. At a given location, values can be missing in the covariates, in the response variable, or in both. Many techniques are available to estimate missing data values, but not all of them can be applied to spatial data since the data are autocorrelated. Hence there is a need to develop a method that estimates the missing values in both the response variable and the covariates in spatial data by taking account of the spatial autocorrelation. The present study aims to develop a model to estimate the missing data points at the aggregate level in spatial data by accounting for (a) spatial autocorrelation of the response variable, (b) spatial autocorrelation of the covariates and (c) correlation between the covariates and the response variable. Estimating the missing values of spatial data requires a model that explicitly accounts for the spatial autocorrelation. The proposed model not only accounts for spatial autocorrelation but also utilizes the correlation that exists between covariates, within covariates and between the response variable and the covariates. Precise estimation of the missing data points in spatial data will result in increased precision of the estimated effects of independent variables on the response variable in spatial regression analysis.

Keywords: spatial regression, missing data estimation, spatial autocorrelation, simulation analysis

Procedia PDF Downloads 357
24573 A Survey on Protests Against Compulsory Hejab in Iran From Iranian Women’s Point of View After Mahsa Amini's Death: A Grounded Theory Approach

Authors: Shirin Arefi

Abstract:

In Iran, women and girls are treated as second-class citizens and suffer from many forms of discrimination and inequality, such as compulsory Hejab, a phenomenon that has required all women to wear the hijab head-covering since the 1979 Islamic revolution. Now, the crackdown by the new government has caused a massive uproar in the country. The morality police also continue to curb the choices of women, and the latest unfortunate incidents have accelerated the hardening of the rules. The author surveys the views of women against compulsory Hejab and the arrests made by the morality and chastity police. The methodology is qualitative: the women's narratives are coded based on grounded theory, and the horizons of the process are also explained through phenomenological research. The findings and results will show the current attitudes of women toward Hejab and their reactions to the behavior of the morality police.

Keywords: compulsory hejab, morality police, people, arrest

Procedia PDF Downloads 87
24572 Association Rules Mining and NOSQL Oriented Document in Big Data

Authors: Sarra Senhadji, Imene Benzeguimi, Zohra Yagoub

Abstract:

Big Data refers to recent technologies for manipulating voluminous and unstructured data sets from multiple sources. NoSQL emerged to handle the problem of unstructured data. Association rules mining is one of the popular data mining techniques for extracting hidden relationships from transactional databases. The problem of finding association dependencies is well solved with MapReduce. The goal of our work is to reduce the time needed to generate frequent itemsets by using MapReduce and a document-oriented NoSQL database. A comparative study is given to evaluate the performance of our algorithm against the classical Apriori algorithm.
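The classical Apriori baseline mentioned in the abstract can be sketched on a single machine as follows. This is a minimal illustration of level-wise frequent-itemset generation, not the authors' MapReduce/MongoDB algorithm; the function name `apriori`, the absolute `min_support` count, and the toy transactions are assumptions made for this example.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, List


def apriori(transactions: Iterable[Iterable[str]],
            min_support: int) -> Dict[FrozenSet[str], int]:
    """Return every itemset found in at least `min_support` transactions."""
    baskets: List[FrozenSet[str]] = [frozenset(t) for t in transactions]

    # Level 1: count single items.
    counts: Dict[FrozenSet[str], int] = {}
    for basket in baskets:
        for item in basket:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        # Join frequent (k-1)-itemsets into k-item candidates and prune
        # any candidate that has an infrequent (k-1)-subset.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in current for sub in combinations(union, k - 1)
            ):
                candidates.add(union)
        # Support counting: the step a MapReduce job would distribute.
        counts = {c: sum(1 for basket in baskets if c <= basket)
                  for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1
    return frequent


# Hypothetical toy transactions.
baskets = [{"bread", "milk"}, {"bread", "butter"},
           {"bread", "milk", "butter"}, {"milk"}]
itemsets = apriori(baskets, min_support=2)
```

The support-counting pass scans every transaction once per level, which is the cost that distributing the count across mappers and reducers is meant to reduce.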

Keywords: Apriori, Association rules mining, Big Data, Data Mining, Hadoop, MapReduce, MongoDB, NoSQL

Procedia PDF Downloads 140
24571 Immunization-Data-Quality in Public Health Facilities in the Pastoralist Communities: A Comparative Study Evidence from Afar and Somali Regional States, Ethiopia

Authors: Melaku Tsehay

Abstract:

The Consortium of Christian Relief and Development Associations (CCRDA) and the CORE Group Polio Partners (CGPP) Secretariat have been working with the Global Alliance for Vaccines and Immunization (GAVI) to improve immunization data quality in Afar and Somali Regional States. The main aim of this study was to compare the quality of immunization data before and after these interventions in health facilities in the pastoralist communities in Ethiopia. To this end, a comparative cross-sectional study was conducted on 51 health facilities. The baseline data were collected in May 2019 and the end-line data in August 2021. The WHO data quality self-assessment tool (DQS) was used to collect data. A significant improvement was seen in the accuracy of the pentavalent vaccine (PT)1 data (p = 0.012) at the health posts (HP), and of the PT3 (p = 0.010) and measles (p = 0.020) data at the health centers (HC). In addition, a highly significant improvement was observed in the accuracy of tetanus toxoid (TT)2 data at HP (p < 0.001). The level of over- or under-reporting for PT3 was found to be < 8% at the HP and < 10% at the HC. Data completeness also increased from 72.09% to 88.89% at the HC. Nearly 74% of the health facilities reported their immunization data on time, which is much better than the baseline (7.1%) (p < 0.001). These findings may provide some hints for policies and programs targeting improved immunization data quality in the pastoralist communities.

Keywords: data quality, immunization, verification factor, pastoralist region

Procedia PDF Downloads 76
24570 Roof Material Detection Based on Object-Based Approach Using WorldView-2 Satellite Imagery

Authors: Ebrahim Taherzadeh, Helmi Z. M. Shafri, Kaveh Shahi

Abstract:

One of the most important tasks in urban remote sensing is the detection of impervious surfaces (IS), such as building roofs and roads. However, detecting IS in heterogeneous areas remains one of the most challenging tasks. This study proposes the detection of concrete roofs using an object-oriented approach. A new rule-based classification was developed to detect concrete roof tiles and was applied to a WorldView-2 image. Results showed that the proposed rule has good potential to predict concrete roof material from WorldView-2 images, with 85% accuracy.

Keywords: object-based, roof material, concrete tile, WorldView-2

Procedia PDF Downloads 409
24569 RA-Apriori: An Efficient and Faster MapReduce-Based Algorithm for Frequent Itemset Mining on Apache Flink

Authors: Sanjay Rathee, Arti Kashyap

Abstract:

Extracting useful information from large datasets is one of the most important research problems, and association rule mining is one of the best methods for this purpose. Finding possible associations between items in large transaction-based datasets (finding frequent patterns) is the most important part of association rule mining. Many algorithms exist for finding frequent patterns, but the Apriori algorithm remains a preferred choice because it is easy to implement and lends itself naturally to parallelization. Many single-machine Apriori variants exist, but the massive amount of data available these days exceeds the capacity of a single machine. Therefore, to meet the demands of this ever-growing data, there is a need for a multi-machine Apriori algorithm. For these types of distributed applications, MapReduce is a popular fault-tolerant framework. Hadoop is one of the best open-source software frameworks using the MapReduce approach for distributed storage and distributed processing of huge datasets on clusters built from commodity hardware. However, the heavy disk I/O at each iteration of a highly iterative algorithm like Apriori makes Hadoop inefficient. A number of MapReduce-based platforms have been developed for parallel computing in recent years. Among them, two platforms, namely Spark and Flink, have attracted a lot of attention because of their built-in support for distributed computation. Earlier, we proposed a Reduced-Apriori algorithm on the Spark platform that outperforms parallel Apriori, first because of the use of Spark and second because of the improvement we proposed to standard Apriori. This work is a natural sequel: it implements, tests and benchmarks Apriori, Reduced-Apriori and our new algorithm, ReducedAll-Apriori, on Apache Flink and compares them with the Spark implementation.
Flink, a streaming dataflow engine, overcomes the disk I/O bottlenecks of MapReduce, providing an ideal platform for distributed Apriori. Flink's pipelining-based structure allows the next iteration to start as soon as partial results of the earlier iteration are available, so there is no need to wait for all reducers' results before starting the next iteration. We conduct in-depth experiments to gain insight into the effectiveness, efficiency and scalability of the Apriori and RA-Apriori algorithms on Flink.
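For readers unfamiliar with the baseline, the iterative structure that makes Apriori expensive on disk-based MapReduce can be seen in a minimal single-machine sketch of the classical algorithm. This is not RA-Apriori and involves neither Flink nor Spark; the function name and the example baskets are assumptions made for illustration. Each pass of the loop corresponds to one full scan of the transactions, which is the step a distributed engine would parallelize.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Classical level-wise Apriori: return {frozenset: support count}."""
    baskets = [frozenset(t) for t in transactions]

    def support(candidates):
        # One scan of the data per level: count and keep frequent candidates.
        counts = {c: sum(1 for b in baskets if c <= b) for c in candidates}
        return {c: n for c, n in counts.items() if n >= min_support}

    singles = {frozenset([item]) for b in baskets for item in b}
    current = support(singles)
    frequent = dict(current)
    k = 2
    while current:
        prev = set(current)
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        current = support(candidates)
        frequent.update(current)
        k += 1
    return frequent


# Hypothetical baskets, not data from the paper.
BASKETS = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c"},
]

if __name__ == "__main__":
    for itemset, count in sorted(apriori(BASKETS, 3).items(), key=lambda kv: sorted(kv[0])):
        print(sorted(itemset), count)
```

Because level k cannot start until level k-1 is fully counted, a framework that writes intermediate results to disk between levels pays that cost once per iteration.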

Keywords: apriori, apache flink, Mapreduce, spark, Hadoop, R-Apriori, frequent itemset mining

Procedia PDF Downloads 270
24568 Attribution of Strategic Motive, Business Efficiencies, Firm Economies, and Market Factors as Motivations of Restaurant Industry Vertical Integration Adoption: A Structural Equation Model

Authors: Sy, Melecio Jr

Abstract:

The decision to adopt vertical integration (VI) is firm-specific, but there is a common practice among businesses in an industry to maximize the massive potential benefits of VI. This study aims to determine VI adoption in the restaurant industry in Davao City. Using a two-step sampling process, the study administered a validated survey questionnaire to 264 restaurant owners and managers, randomly selected and geographically classified. It is a quantitative study in which the data were analyzed with a structural equation model (SEM). The results revealed that VI is present but limited to procurement, production, restaurant services, and online marketing. Raw materials were outsourced, while delivery to customers was handled through third-party delivery services. VI increased slowly over ten years, except for online marketing, which has grown significantly in a few years. The endogenous and exogenous variables were correlated, and a linear regression model was established. The SEM's best-fit model revealed that strategic motives (SMOT) and market factors (MFAC) influenced VI adoption, with MFAC the best predictor. Favorable market factors may lead restaurants to adopt VI. It is thus recommended that restaurants institutionalize strategic management, and that future studies quantify the impact of double marginalization as a reason for VI and repeat this study during the new normal to see the influence of business efficiencies and firm economies on VI adoption.

Keywords: business efficiencies, business management, davao city, firm economies, market factors, philippines, strategic motives, structural equation model, supply chain, vertical integration adoption

Procedia PDF Downloads 54
24567 Geochemical Composition of Deep and Highly Weathered Soils Leyte and Samar Islands Philippines

Authors: Snowie Jane Galgo, Victor Asio

Abstract:

The geochemical composition of soils provides vital information about their origin and development. Highly weathered soils are widespread in the islands of Leyte and Samar, but limited data have been published on their nature, characteristics and nutrient status. This study evaluated the total elemental composition, properties and nutrient status of eight (8) deep and highly weathered soils in various parts of Leyte and Samar. Sampling was done to a depth of 3 to 4 meters. Total amounts of Al₂O₃, As₂O₃, CaO, CdO, Cr₂O₃, CuO, Fe₂O₃, K₂O, MgO, MnO, Na₂O, NiO, P₂O₅, PbO, SO₃, SiO₂, TiO₂, ZnO and ZrO₂ were analyzed using an X-ray analytical microscope for the eight soil profiles. Most of the deep and highly weathered soils have probably developed from homogeneous parent materials, based on the regular distribution of TiO₂ and ZrO₂ with depth. Two of the soils showed high variability of TiO₂ and ZrO₂ with depth, suggesting that these soils developed from heterogeneous parent material. Most soils have K₂O and CaO values below those of MgO and Na₂O. This suggests that greater losses of K₂O and CaO have occurred, since these basic elements are more mobile in the weathering environment than MgO and Na₂O. Most of the soils contain low amounts of other elements such as CuO, ZnO, PbO, NiO, Cr₂O₃ and SO₃ throughout the soil profile. This study is thus useful for sustainable crop production and environmental conservation in the study area, specifically for highly weathered soils, which are widespread in the Philippines.

Keywords: depth function, geochemical composition, highly weathered soils, total elemental composition

Procedia PDF Downloads 241
24566 Identifying Critical Success Factors for Data Quality Management through a Delphi Study

Authors: Maria Paula Santos, Ana Lucas

Abstract:

Organizations base their operations and decision making on the data at their disposal, so the quality of these data is remarkably important. Data Quality (DQ) is currently a relevant issue, and the literature is unanimous in pointing out that poor DQ can result in large costs for organizations. The literature review identified and described 24 Critical Success Factors (CSF) for Data Quality Management (DQM). These were presented to a panel of experts, who ranked them by degree of importance using the Delphi method with the Q-sort technique, based on an online questionnaire. The study shows that the five most important CSF for DQM are: definition of appropriate policies and standards, control of inputs, definition of a strategic plan for DQ, an organizational culture focused on data quality, and obtaining top management commitment and support.

Keywords: critical success factors, data quality, data quality management, Delphi, Q-Sort

Procedia PDF Downloads 194
24565 Yarkovsky Effect on the Orbital Dynamics of the Asteroid (101955) Bennu

Authors: Sanjay Narayan Deo, Badam Singh Kushvah

Abstract:

(101955) Bennu is a half-kilometer, potentially hazardous near-Earth asteroid. We analyze the influence of the Yarkovsky effect and the relativistic effect of the Sun on the motion of Bennu. The transverse model is used to compute the Yarkovsky force on the asteroid. Our dynamical model includes Newtonian perturbations from the eight planets, the Moon, the Sun and three massive asteroids (1 Ceres, 2 Pallas and 4 Vesta). We show the variation in the orbital elements of the asteroid's nominal orbit. In the presence of the Yarkovsky effect, the semi-major axis of the asteroid's orbit decreases by 350 m over one orbital period. The magnitude of the Yarkovsky force is computed; its maximum is 0.09 N at perihelion. We also find that the magnitude of the Sun's relativistic effect on the motion of Bennu is greater than that of the Yarkovsky effect.

Keywords: Bennu, orbital elements, relativistic effect, Yarkovsky effect

Procedia PDF Downloads 275
24564 Digital Transformation as the Subject of the Knowledge Model of the Discursive Space

Authors: Rafal Maciag

Abstract:

Due to the development of the current civilization, one must create suitable models of its pervasive, massive phenomena. One such phenomenon is digital transformation, which has a substantial number of disciplined, methodical interpretations that together form a diversified reflection. This reflection can be understood pragmatically as the current temporal and local differential state of knowledge. The model of the discursive space is proposed as a model for the analysis and description of this knowledge. Discursive space is understood as an autonomous multidimensional space in which separate discourses traverse specific trajectories that can be presented in a multidimensional parallel coordinate system. A discursive space built on the world of facts preserves the complex character of that world. Digital transformation as a discursive space has a relativistic character: it is created by the dynamic discourses and, at the same time, these discourses are molded by the shape of this space.

Keywords: complexity, digital transformation, discourse, discursive space, knowledge

Procedia PDF Downloads 176
24563 Matrix Completion with Heterogeneous Cost

Authors: Ilqar Ramazanli

Abstract:

The matrix completion problem has been studied broadly under many underlying conditions. It has been explored in adaptive and non-adaptive, exact and estimation, single-phase and multi-phase settings, among many other categories. In most of these cases, the observation cost of each entry is uniform across the columns. However, in many real-life scenarios, we could expect elements from distinct columns or distinct positions to have different costs. In this paper, we explore this generalization under adaptive conditions. We approach the problem under two different cost models. In the first, entries from different columns have different observation costs, but within the same column each entry has a uniform cost. In the second, any two entries can have different observation costs, whether they lie in the same column or in different columns. We provide a complexity analysis of our algorithms and provide tightness guarantees.
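The two cost models described in this abstract can be made concrete with a short sketch that only computes the total cost of a chosen set of observed entries. It does not implement the paper's adaptive algorithms; the function names, the index convention (row, column) and the example costs are assumptions made for illustration.

```python
def cost_column_model(observed, column_costs):
    """Model 1: every entry in column j costs column_costs[j]."""
    return sum(column_costs[j] for _, j in observed)


def cost_entry_model(observed, entry_costs):
    """Model 2: entry (i, j) has its own cost entry_costs[i][j]."""
    return sum(entry_costs[i][j] for i, j in observed)


# Hypothetical 2 x 3 example: observe entries (0, 0), (1, 0) and (0, 2).
OBSERVED = {(0, 0), (1, 0), (0, 2)}
COLUMN_COSTS = [1, 5, 2]
ENTRY_COSTS = [
    [1, 5, 2],
    [3, 5, 2],
]

if __name__ == "__main__":
    print(cost_column_model(OBSERVED, COLUMN_COSTS))
    print(cost_entry_model(OBSERVED, ENTRY_COSTS))
```

Under the first model, the two entries observed in column 0 cost the same; under the second, they can differ, so which rows are sampled within a column starts to matter.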

Keywords: matroid optimization, matrix completion, linear algebra, algorithms

Procedia PDF Downloads 83