Search results for: data stream
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 24862

24442 Association Rules Mining and NOSQL Oriented Document in Big Data

Authors: Sarra Senhadji, Imene Benzeguimi, Zohra Yagoub

Abstract:

Big Data refers to recent technologies for manipulating voluminous, unstructured data sets drawn from multiple sources, and NoSQL databases have emerged to handle such unstructured data. Association rule mining is one of the popular data mining techniques for extracting hidden relationships from transactional databases. Finding association dependencies maps well onto MapReduce. The goal of our work is to reduce the time needed to generate frequent itemsets by using MapReduce and a document-oriented NoSQL database. A comparative study evaluates the performance of our algorithm against the classical Apriori algorithm.
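To make the idea of frequent-itemset generation in a map/reduce shape concrete, here is a minimal single-machine sketch. It is not the authors' algorithm and omits Apriori's candidate pruning: it simply counts the support of every k-item subset. The function names and the toy transactions are illustrative assumptions.

```python
from collections import Counter
from itertools import combinations


def map_phase(transaction, k):
    """Emit (itemset, 1) for every k-item subset of one transaction."""
    items = sorted(set(transaction))
    return [(itemset, 1) for itemset in combinations(items, k)]


def reduce_phase(pairs):
    """Sum the counts emitted for each itemset."""
    counts = Counter()
    for itemset, n in pairs:
        counts[itemset] += n
    return counts


def frequent_itemsets(transactions, k, min_support):
    """Return the k-itemsets found in at least min_support transactions."""
    emitted = [pair for t in transactions for pair in map_phase(t, k)]
    counts = reduce_phase(emitted)
    return {s: c for s, c in counts.items() if c >= min_support}


# Hypothetical toy transactions, not data from the paper.
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
]
pairs = frequent_itemsets(transactions, k=2, min_support=2)
```

In a real MapReduce job the map and reduce functions would run on separate workers; here they are ordinary functions so the data flow is easy to follow.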

Keywords: Apriori, Association rules mining, Big Data, Data Mining, Hadoop, MapReduce, MongoDB, NoSQL

Procedia PDF Downloads 143
24441 Mathematical Model That Using Scrambling and Message Integrity Methods in Audio Steganography

Authors: Mohammed Salem Atoum

Abstract:

The success of audio steganography depends on ensuring the imperceptibility of the embedded message in the stego file and on withstanding any form of intentional or unintentional degradation of the message (robustness). Audio steganography that uses the least significant bits (LSB) of the audio stream to embed a message has gained considerable popularity over the years for meeting the requirements of perceptual transparency, robustness, and capacity. This research proposes an XLSB technique to circumvent the weaknesses observed in the LSB technique. A scrambling technique is introduced in two steps: partitioning the message into blocks, followed by permutation of each block in order to confuse the contents of the message. The message is embedded in the MP3 audio sample. After the message is extracted, the permutation codebook is used to reorder it into its original form. Md5sum and SHA-256 are used to verify whether the message was altered during transmission. Experimental results show that XLSB performs better than LSB.
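The scrambling and integrity steps can be sketched as follows. This is an illustration only, not the paper's XLSB method: it assumes that "permutation of each block" means shuffling the bytes inside each block, and the block size, seed, and function names are hypothetical. The embedding into MP3 samples is not shown.

```python
import hashlib
import random


def scramble(message: bytes, block_size: int, seed: int):
    """Split the message into blocks and permute the bytes inside each block.

    Returns the scrambled bytes and the codebook (one permutation per block).
    """
    rng = random.Random(seed)
    out = bytearray()
    codebook = []
    for start in range(0, len(message), block_size):
        block = message[start:start + block_size]
        perm = list(range(len(block)))
        rng.shuffle(perm)
        codebook.append(perm)
        out.extend(block[i] for i in perm)
    return bytes(out), codebook


def unscramble(scrambled: bytes, codebook):
    """Invert scramble() using the stored permutation codebook."""
    out = bytearray()
    pos = 0
    for perm in codebook:
        block = scrambled[pos:pos + len(perm)]
        restored = bytearray(len(perm))
        for j, i in enumerate(perm):
            restored[i] = block[j]  # byte j of the scrambled block came from index i
        out.extend(restored)
        pos += len(perm)
    return bytes(out)


def digest(message: bytes) -> str:
    """SHA-256 digest used to check that the message was not altered."""
    return hashlib.sha256(message).hexdigest()


secret = b"hypothetical secret message"
scrambled, codebook = scramble(secret, block_size=8, seed=42)
recovered = unscramble(scrambled, codebook)
unaltered = digest(recovered) == digest(secret)
```

The receiver needs both the codebook and the sender's digest; comparing digests after unscrambling detects any change to the message in transit.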

Keywords: XLSB, scrambling, audio steganography, security

Procedia PDF Downloads 349
24440 Immunization-Data-Quality in Public Health Facilities in the Pastoralist Communities: A Comparative Study Evidence from Afar and Somali Regional States, Ethiopia

Authors: Melaku Tsehay

Abstract:

The Consortium of Christian Relief and Development Associations (CCRDA) and the CORE Group Polio Partners (CGPP) Secretariat have been working with the Global Alliance for Vaccines and Immunization (GAVI) to improve immunization data quality in the Afar and Somali Regional States. The main aim of this study was to compare the quality of immunization data before and after these interventions in health facilities in the pastoralist communities of Ethiopia. To this end, a comparative cross-sectional study was conducted in 51 health facilities. The baseline data were collected in May 2019 and the end-line data in August 2021. The WHO data quality self-assessment tool (DQS) was used to collect data. A significant improvement was seen in the accuracy of the pentavalent vaccine (PT)1 data (p = 0.012) at the health posts (HP), and of the PT3 (p = 0.010) and measles (p = 0.020) data at the health centers (HC). In addition, a highly significant improvement was observed in the accuracy of tetanus toxoid (TT)2 data at HP (p < 0.001). The level of over- or under-reporting was found to be < 8% at the HP and < 10% at the HC for PT3. Data completeness also increased from 72.09% to 88.89% at the HC. Nearly 74% of the health facilities reported their immunization data on time, which is much better than the baseline (7.1%) (p < 0.001). These findings may provide some hints for policies and programs aimed at improving immunization data quality in the pastoralist communities.

Keywords: data quality, immunization, verification factor, pastoralist region

Procedia PDF Downloads 78
24439 Social Innovation Rediscovered: An Analysis of Empirical Research

Authors: Imen Douzi, Karim Ben Kahla

Abstract:

Despite growing attention to social innovation, the field is still considered to be in its infancy, with minimal progress in theory development. An examination of the field leads to the conclusion that, over the past two decades, academic research has focused primarily on establishing a conceptual foundation. This has resulted in a considerable stream of conceptual papers, which have outnumbered empirical articles. Nevertheless, despite its growing popularity, scholars and practitioners are far from reaching a consensus on what social innovation actually means, which has resulted in competing definitions and approaches within the field and the lack of a unifying conceptual framework. This paper reviews empirical research studies on social innovation, classifies them along three dimensions, and summarizes the research findings for each dimension. Before the analysis of the empirical research, an overview of different perspectives on social innovation is presented.

Keywords: analysis of empirical research, definition, empirical research, social innovation perspectives

Procedia PDF Downloads 364
24438 Current Status of Nitrogen Saturation in the Upper Reaches of the Kanna River, Japan

Authors: Sakura Yoshii, Masakazu Abe, Akihiro Iijima

Abstract:

Nitrogen saturation has become a serious issue in the field of forest environment. The watershed protection forests located in the downwind hinterland of the Tokyo Metropolitan Area are believed to be facing nitrogen saturation. In this study, we focus on the balance of nitrogen between load and runoff. The annual nitrogen load via atmospheric deposition was estimated at 461.1 t-N/year in the upper reaches of the Kanna River. The annual nitrogen runoff to the forested headwater stream of the Kanna River was determined to be 184.9 t-N/year, corresponding to 40.1% of the total nitrogen load. A clear seasonal change in NO3-N concentration was still observed. Therefore, the watershed protection forest of the Kanna River is most likely in Stage 1 of nitrogen saturation.
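The reported runoff share can be checked directly from the two annual figures given in the abstract. The variable names are illustrative; the difference between load and runoff is shown only as arithmetic and is not a quantity the abstract reports.

```python
# Annual figures quoted in the abstract, in t-N/year.
load_t_per_year = 461.1
runoff_t_per_year = 184.9

# Fraction of the atmospheric nitrogen load that leaves via stream runoff.
runoff_share = runoff_t_per_year / load_t_per_year

# Plain difference between load and runoff (not reported in the abstract).
difference_t_per_year = load_t_per_year - runoff_t_per_year

print(f"runoff share: {runoff_share:.1%}")
```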

Keywords: atmospheric deposition, nitrogen accumulation, denitrification, forest ecosystems

Procedia PDF Downloads 258
24437 Identifying Critical Success Factors for Data Quality Management through a Delphi Study

Authors: Maria Paula Santos, Ana Lucas

Abstract:

Organizations base their operations and decision making on the data at their disposal, so the quality of these data is highly important. Data Quality (DQ) is currently a relevant issue, and the literature is unanimous in pointing out that poor DQ can result in large costs for organizations. The literature review identified and described 24 Critical Success Factors (CSF) for Data Quality Management (DQM). These were presented to a panel of experts, who ordered them by degree of importance using the Delphi method with the Q-sort technique, based on an online questionnaire. The study shows that the five most important CSF for DQM are: definition of appropriate policies and standards, control of inputs, definition of a strategic plan for DQ, an organizational culture focused on data quality, and obtaining top management commitment and support.

Keywords: critical success factors, data quality, data quality management, Delphi, Q-Sort

Procedia PDF Downloads 200
24436 Technical Aspects of Closing the Loop in Depth-of-Anesthesia Control

Authors: Gorazd Karer

Abstract:

When a diagnostic procedure or surgery is performed under general anesthesia (GA), proper introduction and dosing of anesthetic agents is one of the main tasks of the anesthesiologist. However, depth of anesthesia (DoA) also seems to be a suitable process for closed-loop control. To implement such a system, one must be able to acquire the relevant signals online and in real time, as well as stream the calculated control signal to the infusion pump. However, during a procedure, patient monitors and infusion pumps are purposely unable to connect to an external (possibly medically unapproved) device for safety reasons, which prevents closed-loop control. The paper proposes a conceptual solution to this problem. First, it presents some important aspects of contemporary clinical practice. Next, it introduces the structure of the closed-loop control system and the relevant information flow. Focusing on transferring data from the patient to the computer, it presents a non-invasive image-based system that acquires signals from a patient monitor for online depth-of-anesthesia assessment. Furthermore, it introduces a UDP-based communication method that can be used for transmitting the calculated anesthetic inflow to the infusion pump. The proposed system is independent of the medical device manufacturer and is implemented in Matlab-Simulink, which can be conveniently used for DoA control implementation. The proposed scheme has been tested in a simulated GA setting and is ready to be evaluated in an operating theatre. However, the proposed system is only a step towards a proper closed-loop control system for DoA that could routinely be used in clinical practice.
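As a rough illustration of streaming a calculated control value over UDP, the sketch below sends one number as a datagram and reads it back on the local machine. It is written in Python rather than the Matlab-Simulink environment the paper uses, and the payload format (a single 8-byte big-endian double), the function names, and the loopback receiver are all assumptions; the paper's actual message format and pump interface are not described in the abstract.

```python
import socket
import struct


def encode_rate(rate: float) -> bytes:
    """Pack one infusion rate as an 8-byte big-endian double (assumed format)."""
    return struct.pack("!d", rate)


def decode_rate(payload: bytes) -> float:
    """Unpack a payload produced by encode_rate()."""
    return struct.unpack("!d", payload)[0]


def send_rate(sock: socket.socket, rate: float, address) -> None:
    """Send one control value as a single UDP datagram."""
    sock.sendto(encode_rate(rate), address)


def loopback_demo(rate: float) -> float:
    """Send one value to a receiver on localhost and return what arrives."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        receiver.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        receiver.settimeout(2.0)         # fail instead of blocking forever
        send_rate(sender, rate, receiver.getsockname())
        payload, _ = receiver.recvfrom(64)
        return decode_rate(payload)
    finally:
        sender.close()
        receiver.close()
```

UDP gives no delivery or ordering guarantee, so a real controller would need sequence numbers, timeouts, and a safe fallback; none of that is shown here.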

Keywords: closed-loop control, depth of anesthesia (DoA), modeling, optical signal acquisition, patient state index (PSi), UDP communication protocol

Procedia PDF Downloads 199
24435 Chemical Study of Volatile Organic Compounds (VOCS) from Xylopia aromatica (LAM.) Mart (Annonaceae)

Authors: Vanessa G. P. Severino, João Gabriel M. Junqueira, Michelle N. G. do Nascimento, Francisco W. B. Aquino, João B. Fernandes, Ana P. Terezan

Abstract:

Scientific interest in analyzing VOCs constitutes a significant modern research field because of their importance in most branches of present-day life and industry. It is therefore extremely important to investigate, identify, and isolate volatile substances, since they can be used in different areas, such as food, medicine, cosmetics, perfumery, aromatherapy, pesticides, repellents, and other household products. Methods for extracting volatile constituents include solid phase microextraction (SPME), hydrodistillation (HD), solvent extraction (SE), Soxhlet extraction, supercritical fluid extraction (SFE), steam distillation (SD), and vacuum distillation (VD). Chemometrics is an area of chemistry that uses statistical and mathematical tools to plan and optimize experimental conditions and to extract relevant chemical information from multivariate chemical data. In this context, the focus of this work was the study of the VOCs of the species X. aromatica by SPME, in search of constituents that can be used in the industrial sector, particularly in food, cosmetics, and perfumery, since these industrial areas play a considerable role. In addition, chemometric analysis was used to maximize the responses of this research, in order to detect the largest number of compounds. The investigation of flowers from X. aromatica in vitro and in vivo proved consistent, although certain factors appear to influence the composition of metabolites, and the chemometric analysis strengthened the analysis. Thus, the study of the chemical composition of X. aromatica contributed to knowledge of the VOCs of the species and to a possible application.

Keywords: chemometrics, flowers, HS-SPME, Xylopia aromatica

Procedia PDF Downloads 341
24434 Morphotectonic Analysis of Burkh Anticline, North of Bastak, Zagros

Authors: A. Afroogh, R. Ramazani omali, N. Hafezi Moghaddas, A. Nohegar

Abstract:

The Burkh anticline, with a length of 50 km and a width of 9 km, is located 40 km north of Bastak in the internal Fars zone of the Zagros fold-thrust belt. To assess active tectonics in the study area, morphometric indexes such as the V index (V), the ratio of valley floor to valley width (Vf), the stream length-gradient ratio (Sl), the channel sinuosity index (S), the mountain front faceting index (F%), and the mountain front sinuosity (Smf) have been studied. These investigations show that activity is not equal along the length of the Burkh anticline. The central part of the anticline is the most active.

Keywords: anticline, internal Fars zone, tectonic, morphometrical indexes, fold-thrust belt

Procedia PDF Downloads 233
24433 Data Mining in Medicine Domain Using Decision Trees and Vector Support Machine

Authors: Djamila Benhaddouche, Abdelkader Benyettou

Abstract:

In this paper, we use data mining to extract biomedical knowledge. In general, complex biomedical data collected in population studies are treated with statistical methods; although these are robust, they are not sufficient in themselves to harness the potential wealth of the data. To that end, we used two learning algorithms: Decision Trees and the Support Vector Machine (SVM). These supervised classification methods are used to diagnose thyroid disease. In this context, we propose to promote the study and use of symbolic data mining techniques.

Keywords: biomedical data, learning, classifier, algorithms decision tree, knowledge extraction

Procedia PDF Downloads 534
24432 Analysis of Different Classification Techniques Using WEKA for Diabetic Disease

Authors: Usama Ahmed

Abstract:

Data mining is the process of analyzing data to predict useful information. It is a field of research that solves various types of problems. In data mining, classification is an important technique for classifying different kinds of data. Diabetes is a very common disease. This paper implements different classification techniques using the Waikato Environment for Knowledge Analysis (WEKA) on a diabetes dataset and identifies which algorithm is most suitable. The best classification algorithm on the diabetes data is Naïve Bayes, with an accuracy of 76.31% and a model build time of 0.06 seconds.

Keywords: data mining, classification, diabetes, WEKA

Procedia PDF Downloads 132
24431 Heat Transfer Modeling of 'Carabao' Mango (Mangifera indica L.) during Postharvest Hot Water Treatments

Authors: Hazel James P. Agngarayngay, Arnold R. Elepaño

Abstract:

Mango is the third most important export fruit in the Philippines. Despite the expanding mango trade in the world market, postharvest losses caused by pests and diseases are still prevalent. Many disease control and pest disinfestation methods have been studied and adopted. Heat treatment is necessary to eliminate pests and diseases in order to pass the quarantine requirements of importing countries. During heat treatments, temperature and time are critical because fruits can easily be damaged by over-exposure to heat. Modeling the process enables researchers and engineers to study the temperature distribution within the fruit over time. Understanding physical processes through modeling and simulation also saves time and resources because less experimentation is needed. This research aimed to simulate the heat transfer mechanism and predict the temperature distribution in 'Carabao' mangoes during hot water treatment (HWT) and extended hot water treatment (EHWT). The simulation was performed in ANSYS CFD Software, using the ANSYS CFX Solver. The simulation process involved model creation, mesh generation, defining the physics of the model, solving the problem, and visualizing the results. Boundary conditions consisted of the convective heat transfer coefficient and a constant free stream temperature. The three-dimensional energy equation for transient conditions was solved numerically to obtain heat flux and transient temperature values. The solver used the finite volume method of discretization. To validate the simulation, actual data were obtained through experiment. The goodness of fit was evaluated using the mean temperature difference (MTD). A t-test was also used to detect significant differences between the data sets. Results showed that the simulations were able to estimate temperatures accurately, with MTD of 0.50 and 0.69 °C for the HWT and EHWT, respectively. This indicates good agreement between the simulated and actual temperature values. The data included in the analysis were taken at different probe puncture locations within the fruit. Moreover, t-tests showed no significant differences between the two data sets. Maximum heat fluxes obtained at the beginning of the treatments were 394.15 and 262.77 J·s⁻¹ for HWT and EHWT, respectively. These values decreased abruptly in the first 10 seconds and more gradually thereafter. Data on heat flux are necessary in the design of heaters. If the heat flux is underestimated, the heating component of a machine will not be able to provide the heat required by certain operations; if it is overestimated, energy and resources will be wasted. This study demonstrated that the simulation was able to estimate temperatures accurately. Thus, it can be used to evaluate the influence of various treatment conditions on the temperature-time history in mangoes. When combined with information on insect mortality and quality degradation kinetics, it could predict the efficacy of a particular treatment and guide appropriate selection of treatment conditions. The effect of various parameters on heat transfer rates, such as the boundary and initial conditions as well as the thermal properties of the material, can be systematically studied without performing experiments. Furthermore, the use of ANSYS software can be explored for modeling and simulating various other systems and processes.

Keywords: heat transfer, heat treatment, mango, modeling and simulation

Procedia PDF Downloads 235
24430 Comprehensive Study of Data Science

Authors: Asifa Amara, Prachi Singh, Kanishka, Debargho Pathak, Akshat Kumar, Jayakumar Eravelly

Abstract:

Today's generation is totally dependent on technology that uses data as its fuel. The present study covers innovations and developments in data science and gives an idea of how to use the available data efficiently. This study will help readers understand the core concepts of data science. The concept of artificial intelligence was introduced by Alan Turing; its main principle was to create an artificial system that can run independently of human-given programs and can function by analyzing data to understand the requirements of its users. Data science comprises business understanding, data analysis, ethical concerns, an understanding of programming languages, various fields and sources of data, skills, etc. The usage of data science has evolved over the years. In this review article, we cover one part of data science, namely machine learning. Machine learning uses data science for its work. Machines learn through experience, which helps them do any work more efficiently. This article includes a comparative figure contrasting human understanding and machine understanding, along with the advantages, applications, and real-time examples of machine learning. Data science is an important game changer in the life of human beings. Since the advent of data science, we have seen its benefits, how it leads to a better understanding of people, and how it cherishes individual needs. It has improved business strategies, the services businesses provide, forecasting, the ability to attain sustainable development, etc. This study also focuses on a better understanding of data science, which will help us create a better world.

Keywords: data science, machine learning, data analytics, artificial intelligence

Procedia PDF Downloads 59
24429 A Deep Learning Approach to Subsection Identification in Electronic Health Records

Authors: Nitin Shravan, Sudarsun Santhiappan, B. Sivaselvan

Abstract:

Subsection identification, in the context of Electronic Health Records (EHRs), is the task of identifying the sections that matter for downstream tasks such as auto-coding. In this work, we classify the text present in EHRs according to the information it contains, using machine learning and deep learning techniques. We first briefly describe the problem and formulate it as a text classification problem. We then discuss methods from the literature. We try two approaches: traditional feature-extraction-based machine learning methods and deep learning methods. Through experiments on a private dataset, we establish that the deep learning methods perform better than the feature-extraction-based machine learning models.

Keywords: deep learning, machine learning, semantic clinical classification, subsection identification, text classification

Procedia PDF Downloads 194
24428 Application of Artificial Neural Network Technique for Diagnosing Asthma

Authors: Azadeh Bashiri

Abstract:

Introduction: Lack of proper diagnosis and inadequate treatment of asthma lead to physical and financial complications. This study aimed to use data mining techniques to create an intelligent neural network system for the diagnosis of asthma. Methods: The study population consisted of patients who had visited one of the lung clinics in Tehran. Data were analyzed using the SPSS statistical tool, and Pearson's chi-square coefficient was the basis of decision making for data ranking. The neural network was trained using the backpropagation learning technique. Results: Based on the SPSS analysis, 13 effective factors were selected. The data were combined in various forms to build different models for training and testing the networks, and in all modes the network correctly predicted 100% of cases. Conclusion: Applying data mining methods before designing the system structure, in order to reduce the data dimension and select the optimal data, leads to a more accurate system. Therefore, given the nature of medical data, data mining approaches are necessary.

Keywords: asthma, data mining, Artificial Neural Network, intelligent system

Procedia PDF Downloads 256
24427 Edible Oil Industry Wastewater Treatment by Microfiltration with Ceramic Membrane

Authors: Zita Šereš, Dragana Šoronja Simović, Ljubica Dokić, Lidietta Giorno, Biljana Pajin, Cecilia Hodur, Nikola Maravić

Abstract:

Membrane technology is convenient for separating suspended solids, colloids, and high molecular weight materials that are present. The idea is that the waste stream from the edible oil industry, after the oil has been separated using skimmers, is subjected to microfiltration, and the resulting permeate can be reused in the production process. Wastewater from the edible oil industry was used for the microfiltration. A tubular membrane with a pore size of 200 nm was used, at transmembrane pressures of up to 3 bar and flow rates of up to 300 L/h. A Box–Behnken design was selected for the experimental work, and the responses considered were permeate flux and chemical oxygen demand (COD) reduction. The reduction of the permeate COD was in the range of 40-60% relative to the feed. The highest permeate flux achieved during microfiltration was 160 L/(m²·h).

Keywords: ceramic membrane, edible oil, microfiltration, wastewater

Procedia PDF Downloads 276
24426 Simulation Model of Biosensor Based on Gold Nanoparticles

Authors: Kholod Hajo

Abstract:

In this study, COMSOL Multiphysics was used to design lateral flow biosensors (LFBs), whose low cost, simplicity, rapidity, stability, and portability make them popular in biomedical, agricultural, food, and environmental sciences. The study focused on a simulation model of a biosensor based on gold nanoparticles (GNPs), designed using the COMSOL Multiphysics software package. The magnitude of the laminar velocity field in the flow cell, the concentration distribution in the analyte stream, the surface coverage of adsorbed species, and the average fractional surface coverage of adsorbed analyte are discussed on the basis of the model. Several suggestions are given for functionalizing GNPs and increasing the accuracy of the biosensor design. All of the above yielded acceptable results.

Keywords: model, gold nanoparticles, biosensor, COMSOL Multiphysics

Procedia PDF Downloads 239
24425 Interpreting Privacy Harms from a Non-Economic Perspective

Authors: Christopher Muhawe, Masooda Bashir

Abstract:

With the increased use of Internet Communication Technology (ICT), the virtual world has become the new normal. At the same time, there is an unprecedented collection of massive amounts of data by both private and public entities. Unfortunately, this increase in data collection has come in tandem with an increase in data misuse and data breaches. Regrettably, the majority of data breach and data misuse claims have been unsuccessful in United States courts for failure to prove direct injury to physical or economic interests. The requirement to express data privacy harms from an economic or physical stance negates the fact that not all data harms are physical or economic in nature. The challenge is compounded by the fact that data breach harms and risks do not attach immediately. This research will use a descriptive and normative approach to show that not all data harms can be expressed in economic or physical terms. Expressing privacy harms purely from an economic or physical perspective negates the fact that data insecurity may result in harms that run counter to the functions of privacy in our lives: the promotion of liberty, selfhood, autonomy, and human social relations, and the furtherance of a free society. No economic value can be placed on these functions of privacy. The proposed approach addresses data harms from a psychological and social perspective.

Keywords: data breach and misuse, economic harms, privacy harms, psychological harms

Procedia PDF Downloads 174
24424 Machine Learning Analysis of Student Success in Introductory Calculus Based Physics I Course

Authors: Chandra Prayaga, Aaron Wade, Lakshmi Prayaga, Gopi Shankar Mallu

Abstract:

This paper presents the use of machine learning algorithms to predict the success of students in an introductory physics course. Data having 140 rows pertaining to the performance of two batches of students was used. The lack of sufficient data to train robust machine learning models was compensated for by generating synthetic data similar to the real data. CTGAN and CTGAN with Gaussian Copula (Gaussian) were used to generate synthetic data, with the real data as input. To check the similarity between the real data and each synthetic dataset, pair plots were made. The synthetic data was used to train machine learning models using the PyCaret package. For the CTGAN data, the Ada Boost Classifier (ADA) was found to be the ML model with the best fit, whereas the CTGAN with Gaussian Copula yielded Logistic Regression (LR) as the best model. Both models were then tested for accuracy with the real data. ROC-AUC analysis was performed for all the ten classes of the target variable (Grades A, A-, B+, B, B-, C+, C, C-, D, F). The ADA model with CTGAN data showed a mean AUC score of 0.4377, but the LR model with the Gaussian data showed a mean AUC score of 0.6149. ROC-AUC plots were obtained for each Grade value separately. The LR model with Gaussian data showed consistently better AUC scores compared to the ADA model with CTGAN data, except in two cases of the Grade value, C- and A-.

Keywords: machine learning, student success, physics course, grades, synthetic data, CTGAN, gaussian copula CTGAN

Procedia PDF Downloads 28
24423 When Change Is the Only Constant: The Impact of Change Frequency and Diversity on Change Appraisal

Authors: Danika Pieters

Abstract:

Due to changing societal and economic demands, organizational change has become increasingly prevalent in work life. While change research has long focused on the effects of single, discrete change events on employee outcomes such as job satisfaction and organizational commitment, a nascent research stream has begun to look into the potential cumulative effects of change in the context of continuous, intense reforms. This case study of a large Belgian public organization aims to add to this growing literature by examining how the frequency and diversity of past changes affect employees' appraisals of a newly introduced change. Twelve hundred survey results were analyzed using standard ordinary least squares regression. Results showed a correlation between high past change frequency and diversity and a negative appraisal of the new change. Implications for practitioners and future research are discussed.

Keywords: change frequency, change diversity, organizational changes, change appraisal, change evaluation

Procedia PDF Downloads 114
24422 Data Access, AI Intensity, and Scale Advantages

Authors: Chuping Lo

Abstract:

This paper presents a simple model demonstrating that ceteris paribus countries with lower barriers to accessing global data tend to earn higher incomes than other countries. Therefore, large countries that inherently have greater data resources tend to have higher incomes than smaller countries, such that the former may be more hesitant than the latter to liberalize cross-border data flows to maintain this advantage. Furthermore, countries with higher artificial intelligence (AI) intensity in production technologies tend to benefit more from economies of scale in data aggregation, leading to higher income and more trade as they are better able to utilize global data.

Keywords: digital intensity, digital divide, international trade, scale of economics

Procedia PDF Downloads 48
24421 Secured Transmission and Reserving Space in Images Before Encryption to Embed Data

Authors: G. R. Navaneesh, E. Nagarajan, C. H. Rajam Raju

Abstract:

Nowadays, multimedia data are used to store secure information. All previous methods allocate space in the image for data embedding after encryption. In this paper, we propose a novel method that reserves space in the image, surrounded by a boundary, before encryption with a traditional reversible data hiding (RDH) algorithm, which makes it easy for the data hider to reversibly embed data in the encrypted images. The proposed method can achieve real time performance, that is, data extraction and image recovery are free of any error. A secure transmission process is also discussed in this paper, which improves efficiency tenfold compared with the other processes discussed.

Keywords: secure communication, reserving room before encryption, least significant bits, image encryption, reversible data hiding

Procedia PDF Downloads 395
24420 Identity Verification Using k-NN Classifiers and Autistic Genetic Data

Authors: Fuad M. Alkoot

Abstract:

DNA data have been used in forensics for decades. However, current research looks at using DNA as a biometric identity verification modality. The goal is to improve the speed of identification. We use gene data that was initially collected for autism detection to determine whether, and how accurately, these data can serve identification applications. Our main goal is to find out whether our data preprocessing technique yields data useful as a biometric identification tool. We experiment with using the nearest neighbor classifier to identify subjects. Results show that the optimal classification rate is achieved when the test set is corrupted by normally distributed noise with zero mean and a standard deviation of 1. The classification rate remains close to optimal as the noise standard deviation increases to 3. This shows that the data can be used for identity verification with high accuracy using a simple classifier such as the k-nearest neighbor (k-NN).
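The evaluation idea, identifying a subject by nearest neighbor after corrupting the query with zero-mean Gaussian noise, can be sketched as below. This is not the paper's pipeline: it uses k = 1, Euclidean distance, and made-up feature vectors, and all names are illustrative assumptions.

```python
import math
import random


def nearest_neighbor(train, query):
    """Return the label of the training vector closest to query (1-NN)."""
    best_label, best_dist = None, math.inf
    for label, vector in train:
        dist = math.dist(vector, query)  # Euclidean distance
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def add_noise(vector, sigma, rng):
    """Corrupt each feature with zero-mean Gaussian noise of std sigma."""
    return [x + rng.gauss(0.0, sigma) for x in vector]


def identification_rate(train, sigma, seed=0):
    """Fraction of subjects still identified after their vector is corrupted."""
    rng = random.Random(seed)
    hits = sum(
        nearest_neighbor(train, add_noise(vector, sigma, rng)) == label
        for label, vector in train
    )
    return hits / len(train)


# Hypothetical enrolled subjects with made-up feature vectors.
enrolled = [
    ("subject_a", [0.0, 0.0, 0.0]),
    ("subject_b", [10.0, 0.0, 5.0]),
    ("subject_c", [0.0, 10.0, -5.0]),
]
clean_rate = identification_rate(enrolled, sigma=0.0)
noisy_rate = identification_rate(enrolled, sigma=3.0)
```

With sigma set to 0 every query equals its enrolled vector, so the rate is 1.0; larger sigma values show how quickly identification degrades for a given spacing between subjects.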

Keywords: biometrics, genetic data, identity verification, k nearest neighbor

Procedia PDF Downloads 234
24419 A Review on Intelligent Systems for Geoscience

Authors: R Palson Kennedy, P.Kiran Sai

Abstract:

This article introduces machine learning (ML) researchers to the hurdles that geoscience problems present, as well as the opportunities for improvement in both ML and the geosciences. This article presents a review from the data life cycle perspective to meet that need. Numerous facets of the geosciences present unique difficulties for the study of intelligent systems. Geoscience data are notoriously difficult to analyze since they are frequently unpredictable, intermittent, sparse, multi-resolution, and multi-scale. The first section addresses the essential concepts and theoretical underpinnings of data science, while the second section covers key themes and shared experiences from current publications focused on each stage of the data life cycle. Finally, themes such as open science, smart data, and team science are considered.

Keywords: Data science, intelligent system, machine learning, big data, data life cycle, recent development, geoscience

Procedia PDF Downloads 122
24418 Steady and Oscillatory States of Swirling Flows under an Axial Magnetic Field

Authors: Brahim Mahfoud, Rachid Bessaïh

Abstract:

In this paper, a numerical study of steady and oscillatory flows with heat transfer subjected to an axial magnetic field is presented. The governing Navier-Stokes, energy, and potential equations, along with appropriate boundary conditions, are solved using the finite-volume method. The flow and temperature fields are presented by stream function and isotherms, respectively. The flow between counter-rotating end disks is very unstable and reveals a great richness of structures. The results are presented for various values of the Hartmann number, Ha=5, 10, 20, and 30, and the Richardson number, Ri=0, 0.5, 1, 2, and 4, in order to see their effects on the value of the critical Reynolds number, Recr. Stability diagrams are established according to the numerical results of this investigation. These diagrams show how Recr depends on increasing Ha for various values of Ri.

Keywords: swirling, counter-rotating end disks, magnetic field, oscillatory, cylinder

Procedia PDF Downloads 311
24417 Propylene Self-Metathesis to Ethylene and Butene over WOx/SiO2, Effect of Nano-Sized Extra Supports (SiO2 and TiO2)

Authors: Adisak Guntida

Abstract:

Propylene self-metathesis to ethylene and butene was studied over WOx/SiO2 catalysts at 450 °C and atmospheric pressure. The WOx/SiO2 catalysts were prepared by incipient wetness impregnation of an ammonium metatungstate aqueous solution. It was found that adding nano-sized extra supports (SiO2 and TiO2) by physical mixing with the WOx/SiO2 enhanced propylene conversion. The UV-Vis and FT-Raman results revealed that WOx could migrate from the original silica support to the extra support, leading to a better dispersion of WOx. The ICP-OES results also indicated that WOx existed on the extra support. Coke formation on the catalysts was investigated by TPO after 10 h time-on-stream. However, adding nano-sized extra supports led to higher coke formation, which may be related to acidity as characterized by NH3-TPD.

Keywords: extra support, nanomaterial, propylene self-metathesis, tungsten oxide

Procedia PDF Downloads 235
24416 Estimation of Ribb Dam Catchment Sediment Yield and Reservoir Effective Life Using Soil and Water Assessment Tool Model and Empirical Methods

Authors: Getalem E. Haylia

Abstract:

The Ribb dam is one of the irrigation projects in the Upper Blue Nile basin, Ethiopia, intended to irrigate the Fogera plain. Reservoir sedimentation is a major problem because it reduces the useful reservoir capacity through the accumulation of sediments coming from the watersheds. Estimates of sediment yield are needed for studies of reservoir sedimentation and for planning soil and water conservation measures. The objective of this study was to simulate the Ribb dam catchment sediment yield using the SWAT model and to estimate the Ribb reservoir effective life according to trap efficiency methods. The Ribb dam catchment lies in the northwestern part of the Ethiopian highlands and belongs to the upper Blue Nile and Lake Tana basins. The Soil and Water Assessment Tool (SWAT) was selected to simulate flow and sediment yield in the Ribb dam catchment. The model sensitivity, calibration, and validation analyses at the Ambo Bahir site were performed with Sequential Uncertainty Fitting (SUFI-2). The flow data at this site were obtained by transforming the Lower Ribb gauge station (2002-2013) flow data using the Area Ratio Method. The sediment load was derived based on the sediment concentration yield curve of the Ambo site. Stream flow results showed that the Nash-Sutcliffe efficiency coefficient (NSE) was 0.81 and the coefficient of determination (R²) was 0.86 in the calibration period (2004-2010), and 0.74 and 0.77 in the validation period (2011-2013), respectively. For the same periods, the NSE and R² for the sediment load were 0.85 and 0.79 in calibration and 0.83 and 0.78 in validation, respectively. The simulated average daily flow rate and sediment yield generated from the Ribb dam watershed were 3.38 m³/s and 1772.96 tons/km²/yr, respectively. The effective life of the Ribb reservoir was estimated using the empirical methods of Brune (1953), Churchill (1948), and Brown (1958) and found to be 30, 38, and 29 years, respectively.
To conclude, massive sediment comes from the steep-slope agricultural areas, and approximately 98-100% of this incoming annual sediment load is trapped by the Ribb reservoir. In the Ribb catchment and reservoir, technical, social, environmental, and catchment management practices should be considered systematically and thoroughly to lengthen the useful life of the Ribb reservoir.
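The two goodness-of-fit statistics quoted in this abstract, and the area-ratio flow transfer, can be sketched as follows. This is an illustrative sketch under stated assumptions, not the study's code: R² is assumed here to be the squared Pearson correlation between observed and simulated series, the area ratio is assumed to use a simple linear scaling (exponent 1), and the function names and sample numbers are hypothetical.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - sum((obs - sim)^2) / sum((obs - mean_obs)^2)."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance


def r_squared(observed, simulated):
    """Coefficient of determination, taken here as the squared Pearson correlation."""
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_sim = sum(simulated) / n
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(observed, simulated))
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_sim = sum((s - mean_sim) ** 2 for s in simulated)
    return cov * cov / (var_obs * var_sim)


def scale_flow_by_area(gauged_flows, target_area, gauged_area):
    """Transfer flows from a gauged site to another site by drainage-area ratio
    (linear scaling is an assumption of this sketch)."""
    ratio = target_area / gauged_area
    return [q * ratio for q in gauged_flows]


# Hypothetical daily flows in m^3/s, for illustration only.
observed_flow = [2.0, 3.5, 5.0, 4.0, 2.5]
simulated_flow = [2.2, 3.1, 4.6, 4.3, 2.4]
print("NSE:", round(nash_sutcliffe(observed_flow, simulated_flow), 3))
print("R2: ", round(r_squared(observed_flow, simulated_flow), 3))
```

An NSE of 1 means a perfect match, while an NSE of 0 means the model is no better than predicting the observed mean, which is why values such as 0.81 and 0.74 are read as good agreement.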

Keywords: catchment, reservoir effective life, reservoir sedimentation, Ribb, sediment yield, SWAT model

Procedia PDF Downloads 164
24415 Data Quality as a Pillar of Data-Driven Organizations: Exploring the Benefits of Data Mesh

Authors: Marc Bachelet, Abhijit Kumar Chatterjee, José Manuel Avila

Abstract:

Data quality is a key component of any data-driven organization. Without data quality, organizations cannot effectively make data-driven decisions, which often leads to poor business performance. Therefore, it is important for an organization to ensure that the data it uses is of high quality. This is where the concept of data mesh comes in. Data mesh is a decentralized organizational and architectural approach to data management that can help organizations improve the quality of data. The concept of data mesh was first introduced in 2020. Its purpose is to decentralize data ownership, making it easier for domain experts to manage the data. This can help organizations improve data quality by reducing the reliance on centralized data teams and allowing domain experts to take charge of their data. This paper intends to discuss how a set of elements, including data mesh, are tools capable of increasing data quality. One of the key benefits of data mesh is improved metadata management. In a traditional data architecture, metadata management is typically centralized, which can lead to data silos and poor data quality. With data mesh, metadata is managed in a decentralized manner, ensuring accurate and up-to-date metadata, thereby improving data quality. Another benefit of data mesh is the clarification of roles and responsibilities. In a traditional data architecture, data teams are responsible for managing all aspects of data, which can lead to confusion and ambiguity in responsibilities. With data mesh, domain experts are responsible for managing their own data, which can help provide clarity in roles and responsibilities and improve data quality. Additionally, data mesh can also contribute to a new form of organization that is more agile and adaptable.
By decentralizing data ownership, organizations can respond more quickly to changes in their business environment, which in turn can help improve overall performance by giving better insight into the business through better reports and visualization tools. Monitoring and analytics are also important aspects of data quality. With data mesh, monitoring and analytics are decentralized, allowing domain experts to monitor and analyze their own data. This helps in identifying and addressing data quality problems quickly, leading to improved data quality. Data culture is another major aspect of data quality. With data mesh, domain experts are encouraged to take ownership of their data, which can help create a data-driven culture within the organization. This can lead to improved data quality and better business outcomes. Finally, the paper explores the contribution of AI in the coming years. AI can help enhance data quality by automating many data-related tasks, such as data cleaning and data validation. By integrating AI into data mesh, organizations can further enhance the quality of their data. The concepts mentioned above are illustrated by AEKIDEN experience feedback. AEKIDEN is an international data-driven consultancy that has successfully implemented a data mesh approach. By sharing their experience, AEKIDEN can help other organizations understand the benefits and challenges of implementing data mesh and improving data quality.

Keywords: data culture, data-driven organization, data mesh, data quality for business success

Procedia PDF Downloads 115
24414 Assessment of Impact of Urbanization in High Mountain Urban Watersheds

Authors: D. M. Rey, V. Delgado, J. Zambrano Nájera

Abstract:

Increased urbanization during the 20th century has changed the natural dynamics of basins, resulting in higher runoff volumes, peak flows, and flow velocities, which in turn increase flood risk. Higher runoff volumes reduce the hydraulic capacity of sewerage networks and can cause their failure. This in turn generates increasingly recurrent floods, causing mobility problems and general economic detriment in cities. In Latin America, especially Colombia, this is a major problem because by the late 20th century more than 70% of the population lived in urban areas, the urban population having increased by approximately 790% over the 1940-1990 period. In addition, the steep slopes produced by Andean topography and the high precipitation typical of tropical climates increase velocities and volumes even more, bringing cities to a standstill during storms. Thus, it becomes very important to understand the hydrological behavior of Andean urban watersheds. This research aims to determine the impact of urbanization on the hydrology of steeply sloped urban watersheds. To this end, the experimental urban Palogrande-San Luis watershed, located in the city of Manizales, Colombia, is used as the study area. Manizales is a city in central western Colombia, located in the Colombian Central Mountain Range (part of the Andes) with an abrupt topography (average altitude of 2,153 m). The climate in Manizales is quite uniform, but due to its high altitude it has high precipitation (1,545 mm/year on average) and high humidity (83% on average). The HEC-HMS hydrologic model was applied to the watershed. The inputs to the model were derived from Geographic Information Systems (GIS) theme layers of the Instituto de Estudios Ambientales (IDEA, Institute of Environmental Studies) of the Universidad Nacional de Colombia, Manizales, and from aerial photography taken for the research, in conjunction with available literature and lookup tables.
Rainfall data from a network of four rain gauges and historical stream flow data were used to calibrate and validate runoff depth using the hydrologic model. Manual calibration was performed, and the simulation results show that the selected model is able to characterize the runoff response of the watershed to land use change from urbanization in high mountain watersheds.

Keywords: Andean watersheds modelling, high mountain urban hydrology, urban planning, hydrologic modelling

Procedia PDF Downloads 216
24413 Big Data Analysis with RHadoop

Authors: Ji Eun Shin, Byung Ho Jung, Dong Hoon Lim

Abstract:

It is almost impossible to store or analyze exponentially growing big data with traditional technologies. Hadoop is a new technology that makes this possible. The R programming language is by far the most popular statistical tool for big data analysis based on distributed processing with Hadoop technology. Using RHadoop, which integrates the R and Hadoop environments, we implemented parallel multiple regression analysis with different sizes of actual data. Experimental results showed that our RHadoop system became much faster as the number of data nodes increased. We also compared the performance of our RHadoop system with the lm function and the biglm package available with bigmemory. The results showed that our RHadoop system was faster than the other packages owing to parallel processing, with the number of map tasks increasing as the size of data increases.
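One common way to parallelize multiple regression in a map/reduce style is to let each map task compute the partial sums X'X and X'y for its chunk of rows and let the reduce step add them before solving the normal equations. The sketch below illustrates that idea in plain Python; it is an assumption about the general technique, not the authors' RHadoop implementation, it does not use Hadoop or any RHadoop API, and all function names and the sample data are hypothetical.

```python
from functools import reduce


def map_chunk(rows):
    """Map step: partial X'X and X'y for one chunk of (features, y) rows.
    A constant 1.0 is prepended to each row to model the intercept."""
    p = len(rows[0][0]) + 1
    xtx = [[0.0] * p for _ in range(p)]
    xty = [0.0] * p
    for features, y in rows:
        x = [1.0] + list(features)
        for i in range(p):
            xty[i] += x[i] * y
            for j in range(p):
                xtx[i][j] += x[i] * x[j]
    return xtx, xty


def reduce_partials(a, b):
    """Reduce step: element-wise sum of two partial results."""
    xtx = [[u + v for u, v in zip(ra, rb)] for ra, rb in zip(a[0], b[0])]
    xty = [u + v for u, v in zip(a[1], b[1])]
    return xtx, xty


def solve(matrix, rhs):
    """Solve matrix * beta = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    beta = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * beta[c] for c in range(r + 1, n))
        beta[r] = (aug[r][n] - tail) / aug[r][r]
    return beta


def fit_regression(chunks):
    """Combine per-chunk partial sums, then solve the normal equations."""
    xtx, xty = reduce(reduce_partials, map(map_chunk, chunks))
    return solve(xtx, xty)


# Hypothetical data generated from y = 1 + 2*x1 + 3*x2, split into two chunks.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0), (1.0, 3.0)]
rows = [((x1, x2), 1.0 + 2.0 * x1 + 3.0 * x2) for x1, x2 in points]
coefficients = fit_regression([rows[:3], rows[3:]])
print([round(c, 6) for c in coefficients])
```

Because the partial sums are small fixed-size matrices regardless of chunk size, only they need to travel between nodes, which is what makes this formulation a natural fit for many map tasks.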

Keywords: big data, Hadoop, parallel regression analysis, R, RHadoop

Procedia PDF Downloads 416