Search results for: data block
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 25318

24838 The Utilization of Big Data in Knowledge Management Creation

Authors: Daniel Brian Thompson, Subarmaniam Kannan

Abstract:

The volume of knowledge in the world and within organizational repositories is already immense and keeps growing. To cope with this growth, Big Data implementations and algorithms are used to obtain new or enhanced knowledge for decision-making. The transition from data to knowledge brings transformational changes that provide tangible benefits to those who implement these practices. Today, various organizations derive knowledge from observations and intuitions, and this information or data is translated into best practices for knowledge acquisition, generation and sharing. Through the widespread use of Big Data, the main intention is to provide information that has been cleaned and analyzed, yielding tangible insights that an organization can apply to its knowledge-creation practices based on facts and figures. Translating data into knowledge generates value for an organization, enabling it to make firm decisions about adopting best practices. Without a strong foundation of knowledge and Big Data, businesses are unable to grow and improve in a competitive environment.

Keywords: big data, knowledge management, data driven, knowledge creation

Procedia PDF Downloads 98
24837 Survey on Data Security Issues Through Cloud Computing Amongst Sme’s in Nairobi County, Kenya

Authors: Masese Chuma Benard, Martin Onsiro Ronald

Abstract:

Businesses have recently been using cloud computing more frequently to benefit from its advantages. However, cloud computing also introduces new security concerns, particularly with regard to data security, potential risks and weaknesses that attackers could exploit, and the tactics and strategies that could be used to lessen these risks. This study examines data security issues in cloud computing amongst SMEs in Nairobi County, Kenya. The study used a sample size of 48 and a mixed-methods research approach. The findings show that the data owner has no control over the cloud provider's data management procedures, so there is no way to ensure that data is handled legally. This implies that the owner loses control over the data stored in the cloud. Data and information stored in the cloud may face a range of availability issues due to internet outages; this can represent a significant risk to data kept in shared clouds. Integrity, availability, and confidentiality are all discussed.

Keywords: data security, cloud computing, information, information security, small and medium-sized firms (SMEs)

Procedia PDF Downloads 71
24836 Cloud Design for Storing Large Amount of Data

Authors: M. Strémy, P. Závacký, P. Cuninka, M. Juhás

Abstract:

The main goal of this paper is to introduce our design of a private cloud for storing large amounts of data, especially pictures, and to provide a good technological backend for data analysis based on parallel processing and business intelligence. We have tested hypervisors, cloud management tools, storage for all data, and Hadoop for analysis of unstructured data. High availability, virtual network management, logical separation of projects, and rapid deployment of physical servers to our environment were also required.

Keywords: cloud, glusterfs, hadoop, juju, kvm, maas, openstack, virtualization

Procedia PDF Downloads 343
24835 Highway Waste Management in Zambia Policy Preparedness and Remedies: The Case of Great East Road

Authors: Floyd Misheck Mwanza, Paul Boniface Majura

Abstract:

The paper examined highway and roadside waste generation and disposal and the consequent environmental impacts. The dramatic increase in vehicle traffic and paved roads in Zambia in the recent past has given rise to indiscriminate disposal of litter, which now poses a threat to health and the environment. Primary data were generated through oral interviews and field observations for a holistic, in-depth assessment of the environment; secondary data were obtained by desk review, and information on the effects of roadside waste on the environment was drawn from the relevant literature. The interviews were semi-structured; a purposive sampling method was adopted, and the data were analyzed descriptively. The findings showed that population growth and unplanned road expansion have recently exceeded expected limits, resulting in a poor system of roadside waste disposal. Roadside waste, both biodegradable and non-biodegradable, is disposed of on the shoulders of major highways in temporary dumpsites and is never collected by the Road Development Agency (RDA). There is no organized highway-to-highway or street-to-street collection of waste in Zambia by the RDA, the key organization. The study revealed that roadside disposal of waste has serious impacts on the environment. These impacts include the physical nuisance of the waste, and the waste dumps also serve as hideouts for rodents and snakes, which are dangerous. Waste is blown around by the wind, making the environment filthy, and most of it is also washed by overland flow during heavy downpours into drainage channels, blocking them and subsequently causing flooding. Most of the non-biodegradable waste contains toxic chemicals, which have serious implications for environmental sustainability and human health.
The paper therefore recommends that the Government and the RDA provide proper orientation for the general public, put environmental laws in place, provide the necessary facilities, and arrange better methods of waste collection.

Keywords: biodegradable, disposal, environment, impacts

Procedia PDF Downloads 327
24834 Estimation of Missing Values in Aggregate Level Spatial Data

Authors: Amitha Puranik, V. S. Binu, Seena Biju

Abstract:

Missing data is a common problem in spatial analysis, especially at the aggregate level. At a given location, values may be missing in the covariates, in the response variable, or in both. Many techniques are available to estimate missing data values, but not all of them can be applied to spatial data because the data are autocorrelated. Hence there is a need for a method that estimates the missing values in both the response variable and the covariates of spatial data while taking the spatial autocorrelation into account. The present study aims to develop a model to estimate the missing data points at the aggregate level in spatial data by accounting for (a) spatial autocorrelation of the response variable, (b) spatial autocorrelation of the covariates, and (c) correlation between the covariates and the response variable. Estimating the missing values of spatial data requires a model that explicitly accounts for the spatial autocorrelation. The proposed model not only accounts for spatial autocorrelation but also uses the correlation that exists between covariates, within covariates, and between the response variable and the covariates. Precise estimation of the missing data points in spatial data will increase the precision of the estimated effects of independent variables on the response variable in spatial regression analysis.

Keywords: spatial regression, missing data estimation, spatial autocorrelation, simulation analysis

Procedia PDF Downloads 364
24833 Association Rules Mining and NOSQL Oriented Document in Big Data

Authors: Sarra Senhadji, Imene Benzeguimi, Zohra Yagoub

Abstract:

Big Data refers to recent technology for manipulating voluminous, unstructured data sets drawn from multiple sources. NoSQL has emerged to handle the problem of unstructured data. Association rules mining is one of the popular data mining techniques for extracting hidden relationships from transactional databases. The problem of finding association dependencies can be solved well with MapReduce. The goal of our work is to reduce the time needed to generate frequent itemsets by using MapReduce and a document-oriented NoSQL database. A comparative study evaluates the performance of our algorithm against the classical Apriori algorithm.
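
For readers unfamiliar with the baseline named in the abstract, the following is a minimal single-machine sketch of classical Apriori frequent-itemset generation. It is not the authors' MapReduce or MongoDB implementation; the function name `frequent_itemsets` and the use of an absolute support count are assumptions made for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Classical Apriori: return {itemset: count} for every itemset that
    appears in at least `min_support` transactions (absolute count)."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(current)

    k = 2
    while current:
        # Join step: build size-k candidates from frequent (k-1)-itemsets.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune step: every (k-1)-subset must itself be frequent.
            if all(frozenset(sub) in current
                   for sub in combinations(union, k - 1)):
                candidates.add(union)
        # Count step: one pass over the transactions per level.
        counts = {c: sum(1 for t in transactions if c <= t)
                  for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_support}
        result.update(current)
        k += 1
    return result


baskets = [
    {"bread", "milk"},
    {"bread", "diaper", "beer"},
    {"milk", "diaper", "beer"},
    {"bread", "milk", "diaper"},
    {"bread", "milk", "beer"},
]
itemsets = frequent_itemsets(baskets, min_support=3)
```

The count step is the part that a MapReduce formulation would typically distribute, since each transaction can be scanned independently and the per-candidate counts summed afterwards.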

Keywords: Apriori, Association rules mining, Big Data, Data Mining, Hadoop, MapReduce, MongoDB, NoSQL

Procedia PDF Downloads 149
24832 Immunization-Data-Quality in Public Health Facilities in the Pastoralist Communities: A Comparative Study Evidence from Afar and Somali Regional States, Ethiopia

Authors: Melaku Tsehay

Abstract:

The Consortium of Christian Relief and Development Associations (CCRDA) and the CORE Group Polio Partners (CGPP) Secretariat have been working with the Global Alliance for Vaccines and Immunization (GAVI) to improve immunization data quality in Afar and Somali Regional States. The main aim of this study was to compare the quality of immunization data before and after these interventions in health facilities in the pastoralist communities in Ethiopia. To this end, a comparative cross-sectional study was conducted on 51 health facilities. The baseline data were collected in May 2019 and the end-line data in August 2021. The WHO data quality self-assessment tool (DQS) was used to collect data. A significant improvement was seen in the accuracy of the pentavalent vaccine (PT)1 (p = 0.012) data at the health posts (HP), and of PT3 (p = 0.010) and measles (p = 0.020) data at the health centers (HC). In addition, a highly significant improvement was observed in the accuracy of tetanus toxoid (TT)2 data at HP (p < 0.001). The level of over- or under-reporting for PT3 was found to be < 8% at the HP and < 10% at the HC. Data completeness also increased from 72.09% to 88.89% at the HC. Nearly 74% of the health facilities reported their immunization data on time, which is much better than the baseline (7.1%) (p < 0.001). These findings may provide some hints for policies and programs aimed at improving immunization data quality in the pastoralist communities.
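
The accuracy results above rest on comparing recounted and reported doses, which the keyword list calls the verification factor. A minimal sketch of that arithmetic follows, assuming the common definition (doses recounted from facility records divided by doses reported upward, as a percentage); the function names and the example figures are hypothetical and are not taken from the study.

```python
def verification_factor(recounted, reported):
    """Recounted doses as a percentage of reported doses.

    Assumed definition: 100 * recounted / reported. A value below 100
    suggests over-reporting; a value above 100 suggests under-reporting.
    """
    if reported <= 0:
        raise ValueError("reported must be a positive count")
    return 100.0 * recounted / reported


def reporting_status(recounted, reported):
    """Label a facility's report relative to its recounted source records."""
    vf = verification_factor(recounted, reported)
    if vf < 100.0:
        return "over-reporting"
    if vf > 100.0:
        return "under-reporting"
    return "consistent"


# Hypothetical facility: 46 doses found in the register, 50 doses reported.
vf_example = verification_factor(46, 50)
status_example = reporting_status(46, 50)
```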

Keywords: data quality, immunization, verification factor, pastoralist region

Procedia PDF Downloads 90
24831 Identifying Critical Success Factors for Data Quality Management through a Delphi Study

Authors: Maria Paula Santos, Ana Lucas

Abstract:

Organizations support their operations and decision making on the data they have at their disposal, so the quality of these data is remarkably important and Data Quality (DQ) is currently a relevant issue, the literature being unanimous in pointing out that poor DQ can result in large costs for organizations. The literature review identified and described 24 Critical Success Factors (CSF) for Data Quality Management (DQM) that were presented to a panel of experts, who ordered them according to their degree of importance, using the Delphi method with the Q-sort technique, based on an online questionnaire. The study shows that the five most important CSF for DQM are: definition of appropriate policies and standards, control of inputs, definition of a strategic plan for DQ, organizational culture focused on quality of the data and obtaining top management commitment and support.

Keywords: critical success factors, data quality, data quality management, Delphi, Q-Sort

Procedia PDF Downloads 204
24830 Repurposing Dairy Manure Solids as a Non- Polluting Fertilizer and the Effects on Nutrient Recovery in Tomatoes (Solanum Lycopersicum)

Authors: Devon Simpson

Abstract:

Recycled Manure Solids (RMS), attained via centrifugation from Canadian dairy farms, were synthesized into a non-polluting fertilizer by bonding micronutrients (Fe, Zn, and Mn) to cellulose fibers and then assessed for the effectiveness of nutrient recovery in tomatoes. Manure management technology is critical for improving the sustainability of agroecosystems and has the capacity to offer a truly circular economy. The ability to add value to manure byproducts offers an opportunity for economic benefits while generating tenable solutions to livestock waste. The dairy industry is under increasing pressure from new environmental protections such as government restrictions on manure applications, limitations on herd size as well as increased product demand from a growing population. Current systems use RMS as bedding, so there is a lack of data pertaining to RMS use as a fertilizer. This is because of nutrient distribution, where most nutrients are retained in the liquid effluent of the solid-liquid separation. A literature review on the physical and chemical properties of dairy manure further revealed more data for raw manure than centrifuged solids. This research offers an innovative perspective and a new avenue of exploration in the use of RMS. Manure solids in this study were obtained directly from dairy farms in Salmon Arm and Abbotsford, British Columbia, and underwent physical, chemical, and biological characterizations pre- and post-synthesis processing. Samples were sent to A&L labs Canada for analysis. Once characterized and bonded to micronutrients, the effect of synthesized RMS on nutrient recovery in tomatoes was studied in a greenhouse environment. The agricultural research package ‘agricolae’ for R was used for experimental design and data analysis. The growth trials consisted of a randomized complete block design (RCBD) that allowed for analysis of variance (ANOVA). 
The primary outcome was nutrient uptake, measured with an Inductively Coupled Plasma Mass Spectrometer (ICP-MS) to analyze the micronutrient content of both the tissue and the fruit of the tomatoes. It was found that treatments containing bonded dairy manure solids had an increased micronutrient concentration. Treatments with bonded dairy manure solids also saw an increase in yield, and a Brix analysis showed higher sugar content than the untreated control and a grower standard.
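
The abstract reports a randomized complete block design analyzed by ANOVA in R with 'agricolae'. As an illustration of the underlying arithmetic only, the sketch below partitions the sums of squares for an RCBD in plain Python. The function name `rcbd_anova` and the sample numbers are hypothetical; this is not the authors' analysis or data.

```python
def rcbd_anova(y):
    """Sum-of-squares partition for a randomized complete block design.

    `y[i][j]` is the response for treatment i in block j, with one
    observation per treatment-block cell.
    """
    t, b = len(y), len(y[0])
    grand = sum(sum(row) for row in y) / (t * b)
    treat_means = [sum(row) / b for row in y]
    block_means = [sum(y[i][j] for i in range(t)) / t for j in range(b)]

    ss_total = sum((v - grand) ** 2 for row in y for v in row)
    ss_treat = b * sum((m - grand) ** 2 for m in treat_means)
    ss_block = t * sum((m - grand) ** 2 for m in block_means)
    ss_error = ss_total - ss_treat - ss_block

    df_treat, df_block = t - 1, b - 1
    df_error = df_treat * df_block
    f_treat = (ss_treat / df_treat) / (ss_error / df_error)
    return {
        "ss_treat": ss_treat, "ss_block": ss_block, "ss_error": ss_error,
        "df_treat": df_treat, "df_block": df_block, "df_error": df_error,
        "f_treat": f_treat,
    }


# Hypothetical data: 3 treatments (rows) observed in 3 blocks (columns).
table = rcbd_anova([[10, 12, 14], [20, 22, 24], [30, 32, 35]])
```

The F statistic for treatments would then be compared against an F distribution with `df_treat` and `df_error` degrees of freedom; that lookup is left out to keep the sketch free of external libraries.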

Keywords: agroecosystems, dairy manure, micronutrient fertilizer, manure management, nutrient recovery, nutrient recycling, recycled manure solids, regenerative agriculture, sustainable farming

Procedia PDF Downloads 175
24829 Data Mining in Medicine Domain Using Decision Trees and Vector Support Machine

Authors: Djamila Benhaddouche, Abdelkader Benyettou

Abstract:

In this paper, we used data mining to extract biomedical knowledge. In general, complex biomedical data collected in population studies are treated with statistical methods; although these are robust, they are not sufficient in themselves to harness the potential wealth of the data. To that end, we used, in a second step, two learning algorithms: Decision Trees and the Support Vector Machine (SVM). These supervised classification methods are used to diagnose thyroid disease. In this context, we propose to promote the study and use of symbolic data mining techniques.

Keywords: biomedical data, learning, classifier, algorithms decision tree, knowledge extraction

Procedia PDF Downloads 541
24828 Analysis of Different Classification Techniques Using WEKA for Diabetic Disease

Authors: Usama Ahmed

Abstract:

Data mining is the process of analyzing data to extract useful, predictive information. It is a field of research that addresses many types of problems. In data mining, classification is an important technique for categorizing different kinds of data. Diabetes is one of the most common diseases. This paper applies different classification techniques, using the Waikato Environment for Knowledge Analysis (WEKA), to a diabetes dataset and identifies which algorithm is most suitable. The best classification algorithm on the diabetic data is Naïve Bayes, which achieves an accuracy of 76.31% and takes 0.06 seconds to build the model.

Keywords: data mining, classification, diabetes, WEKA

Procedia PDF Downloads 134
24827 Comprehensive Evaluation of COVID-19 Through Chest Images

Authors: Parisa Mansour

Abstract:

The coronavirus disease 2019 (COVID-19) was discovered at the end of 2019 and has since spread rapidly to countries around the world. Computed tomography (CT) images have been used as an important alternative to the time-consuming RT-PCR test. However, manual segmentation of CT images alone is a major challenge as the number of suspected cases increases. Thus, accurate and automatic segmentation of COVID-19 infections is urgently needed. Because the imaging features of COVID-19 infection vary widely and resemble the background, existing medical image segmentation methods cannot achieve satisfactory performance. In this work, we try to build a deep convolutional neural network adapted for the segmentation of chest CT images with COVID-19 infections. First, we maintain a large and novel chest CT image database containing 165,667 annotated chest CT images from 861 patients with confirmed COVID-19. Inspired by the observation that the boundary of an infected lung can be enhanced by global intensity adjustment, we introduce a feature variable block into the proposed deep CNN, which adjusts the global properties of the features to segment the COVID-19 infection. The proposed block can effectively and adaptively improve the feature representation in different cases. We combine features of different scales by proposing a progressive atrous spatial pyramid fusion scheme to deal with advanced infection regions of various appearances and shapes. We conducted experiments on data collected in China and Germany and showed that the proposed deep CNN achieves strong performance.

Keywords: chest, COVID-19, chest Image, coronavirus, CT image, chest CT

Procedia PDF Downloads 41
24826 Comprehensive Study of Data Science

Authors: Asifa Amara, Prachi Singh, Kanishka, Debargho Pathak, Akshat Kumar, Jayakumar Eravelly

Abstract:

Today's generation is totally dependent on technology that uses data as its fuel. The present study is all about innovations and developments in data science and gives an idea about how efficiently to use the data provided. This study will help to understand the core concepts of data science. The concept of artificial intelligence was introduced by Alan Turing in which the main principle was to create an artificial system that can run independently of human-given programs and can function with the help of analyzing data to understand the requirements of the users. Data science comprises business understanding, analyzing data, ethical concerns, understanding programming languages, various fields and sources of data, skills, etc. The usage of data science has evolved over the years. In this review article, we have covered a part of data science, i.e., machine learning. Machine learning uses data science for its work. Machines learn through their experience, which helps them to do any work more efficiently. This article includes a comparative study image between human understanding and machine understanding, advantages, applications, and real-time examples of machine learning. Data science is an important game changer in the life of human beings. Since the advent of data science, we have found its benefits and how it leads to a better understanding of people, and how it cherishes individual needs. It has improved business strategies, services provided by them, forecasting, the ability to attend sustainable developments, etc. This study also focuses on a better understanding of data science which will help us to create a better world.

Keywords: data science, machine learning, data analytics, artificial intelligence

Procedia PDF Downloads 69
24825 Optimization of Highly Oriented Pyrolytic Graphite Crystals for Neutron Optics

Authors: Hao Qu, Xiang Liu, Michael Crosby, Brian Kozak, Andreas K. Freund

Abstract:

The outstanding performance of highly oriented pyrolytic graphite (HOPG) as an optical element for neutron beam conditioning is unequaled by any other crystalline material in monochromator, analyzer, and filter applications. This superiority stems from the favorable nuclear properties of carbon (small absorption and incoherent scattering cross-sections, large coherent scattering length) and the specific crystalline structure (small thermal diffuse scattering cross-section, layered crystal structure). The real crystal defect structure revealed by imaging techniques is correlated with the parameters used in the mosaic model (mosaic spread, mosaic block size, uniformity). The diffraction properties (rocking curve width as determined by both the intrinsic mosaic spread and the diffraction process, peak and integrated reflectivity, filter transmission) as a function of neutron wavelength or energy can be predicted with high accuracy and reliability by diffraction theory, using empirical primary extinction coefficients extracted from a large body of existing experimental data. The results of these calculations are given as graphs and tables that make it possible to optimize HOPG characteristics (mosaic spread, thickness, curvature) for any given experimental situation.

Keywords: neutron optics, pyrolytic graphite, mosaic spread, neutron scattering, monochromator, analyzer

Procedia PDF Downloads 128
24824 Enhancing Scalability in Ethereum Network Analysis: Methods and Techniques

Authors: Stefan K. Behfar

Abstract:

The rapid growth of the Ethereum network has brought forth the urgent need for scalable analysis methods to handle the increasing volume of blockchain data. In this research, we propose efficient methodologies for making Ethereum network analysis scalable. Our approach leverages a combination of graph-based data representation, probabilistic sampling, and parallel processing techniques to achieve unprecedented scalability while preserving critical network insights. Data Representation: We develop a graph-based data representation that captures the underlying structure of the Ethereum network. Each block transaction is represented as a node in the graph, while the edges signify temporal relationships. This representation ensures efficient querying and traversal of the blockchain data. Probabilistic Sampling: To cope with the vastness of the Ethereum blockchain, we introduce a probabilistic sampling technique. This method strategically selects a representative subset of transactions and blocks, allowing for concise yet statistically significant analysis. The sampling approach maintains the integrity of the network properties while significantly reducing the computational burden. Graph Convolutional Networks (GCNs): We incorporate GCNs to process the graph-based data representation efficiently. The GCN architecture enables the extraction of complex spatial and temporal patterns from the sampled data. This combination of graph representation and GCNs facilitates parallel processing and scalable analysis. Distributed Computing: To further enhance scalability, we adopt distributed computing frameworks such as Apache Hadoop and Apache Spark. By distributing computation across multiple nodes, we achieve a significant reduction in processing time and enhanced memory utilization. Our methodology harnesses the power of parallelism, making it well-suited for large-scale Ethereum network analysis. 
Evaluation and Results: We extensively evaluate our methodology on real-world Ethereum datasets covering diverse time periods and transaction volumes. The results demonstrate its superior scalability, outperforming traditional analysis methods. Our approach successfully handles the ever-growing Ethereum data, empowering researchers and developers with actionable insights from the blockchain. Case Studies: We apply our methodology to real-world Ethereum use cases, including detecting transaction patterns, analyzing smart contract interactions, and predicting network congestion. The results showcase the accuracy and efficiency of our approach, emphasizing its practical applicability in real-world scenarios. Security and Robustness: To ensure the reliability of our methodology, we conduct thorough security and robustness evaluations. Our approach demonstrates high resilience against adversarial attacks and perturbations, reaffirming its suitability for security-critical blockchain applications. Conclusion: By integrating graph-based data representation, GCNs, probabilistic sampling, and distributed computing, we achieve network scalability without compromising analytical precision. This approach addresses the pressing challenges posed by the expanding Ethereum network, opening new avenues for research and enabling real-time insights into decentralized ecosystems. Our work contributes to the development of scalable blockchain analytics, laying the foundation for sustainable growth and advancement in the domain of blockchain research and application.
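
The abstract does not say which probabilistic sampling technique is used. As one common option for drawing a uniform sample from a transaction stream of unknown length, the sketch below shows reservoir sampling (Algorithm R). The function name `reservoir_sample` and the integer stand-ins for transactions are assumptions for illustration, not the author's method.

```python
import random


def reservoir_sample(stream, k, seed=None):
    """Return k items chosen uniformly at random from an iterable of
    unknown length, using O(k) memory (Algorithm R)."""
    rng = random.Random(seed)
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Replace a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # both endpoints inclusive
            if j < k:
                reservoir[j] = item
    return reservoir


# Hypothetical stream: integers standing in for transaction records.
sample = reservoir_sample(range(1000), k=10, seed=0)
```

Because the stream is read once and only `k` items are held in memory, this style of sampling suits ledgers that are too large to load in full.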

Keywords: Ethereum, scalable network, GCN, probabilistic sampling, distributed computing

Procedia PDF Downloads 59
24823 Application of Artificial Neural Network Technique for Diagnosing Asthma

Authors: Azadeh Bashiri

Abstract:

Introduction: Lack of proper diagnosis and inadequate treatment of asthma lead to physical and financial complications. This study aimed to use data mining techniques to create an intelligent neural network system for the diagnosis of asthma. Methods: The study population consisted of patients who had visited one of the lung clinics in Tehran. Data were analyzed using the SPSS statistical tool, and Pearson's chi-square coefficient was the basis of decision making for data ranking. The neural network was trained using the back-propagation learning technique. Results: Based on the SPSS analysis used to select the top factors, 13 effective factors were selected. The data were combined in various forms to build different models for training and testing the networks, and in all of these configurations the network correctly predicted 100% of cases. Conclusion: Using data mining methods before designing the structure of the system, in order to reduce the data dimension and choose the data optimally, leads to a more accurate system. Therefore, given the nature of medical data, considering data mining approaches is necessary.

Keywords: asthma, data mining, Artificial Neural Network, intelligent system

Procedia PDF Downloads 261
24822 Interpreting Privacy Harms from a Non-Economic Perspective

Authors: Christopher Muhawe, Masooda Bashir

Abstract:

With the growth of Internet Communication Technology (ICT), the virtual world has become the new normal. At the same time, there is an unprecedented collection of massive amounts of data by both private and public entities. Unfortunately, this increase in data collection has been accompanied by an increase in data misuse and data breaches. Regrettably, the majority of data breach and data misuse claims have been unsuccessful in United States courts for failure to prove direct injury to physical or economic interests. The requirement to express data privacy harms from an economic or physical stance ignores the fact that not all data harms are physical or economic in nature. The challenge is compounded by the fact that data breach harms and risks do not attach immediately. This research will use a descriptive and normative approach to show that not all data harms can be expressed in economic or physical terms. Expressing privacy harms purely from an economic or physical perspective ignores the fact that data insecurity may result in harms that run counter to the functions of privacy in our lives: the promotion of liberty, selfhood, autonomy, and human social relations, and the furtherance of a free society. No economic value can be placed on these functions of privacy. The proposed approach addresses data harms from a psychological and social perspective.

Keywords: data breach and misuse, economic harms, privacy harms, psychological harms

Procedia PDF Downloads 183
24821 Seismic Assessment of a Pre-Cast Recycled Concrete Block Arch System

Authors: Amaia Martinez Martinez, Martin Turek, Carlos Ventura, Jay Drew

Abstract:

This study aims to assess the seismic performance of arch and dome structural systems made from easy-to-assemble precast blocks of recycled concrete. These systems have been developed by Lock Block Ltd. of Vancouver, Canada, as an extension of their currently used retaining wall system. The seismic behavior of these structures is characterized by a combination of experimental static and dynamic testing and analytical modeling. For the experimental testing, several tilt tests and a program of shake table testing were undertaken using small-scale arch models. A suite of earthquakes with different characteristics from important past events was chosen and scaled appropriately for the dynamic testing. Shake table tests applying the ground motions in just one direction (the weak direction of the arch) and in all three directions were conducted and compared. The models were tested with increasing intensity until collapse occurred, which determines the failure level for each earthquake. Since the failure intensity varied with the type of earthquake, a sensitivity analysis of the different parameters was performed, with impulses being the dominant factor. In all cases, the arches exhibited the typical four-hinge failure mechanism, which was also shown by the analytical model. Experimental testing was also performed with the arches reinforced by a steel band placed over the structure and anchored at both ends of the arch. The models were tested with different pretension levels. The bands were instrumented with strain gauges to measure the force produced by the shaking. These forces were used to develop engineering guidelines for the design of the reinforcement needed for these systems. In addition, an analytical discrete element model was created using 3DEC software. The blocks were modeled as rigid, with all properties assigned to the joints, including the contribution of the interlocking shear key between blocks.
The model is calibrated against the experimental static tests and validated with the results of the dynamic tests. The model can then be used to scale the results up to the full-scale structure and to extend them to different configurations and boundary conditions.

Keywords: arch, discrete element model, seismic assessment, shake-table testing

Procedia PDF Downloads 198
24820 Effect of Salicylic Acid and Nitrogen Fertilizer on Wheat Growth and Yield

Authors: Omar Ibrahim, Aly A. Gaafar, K. A. Ratib

Abstract:

Two field experiments in micro plots were carried out during the winter seasons of 2012/2013 and 2013/2014, Soil Salinity Laboratory, Alexandria, Egypt, to study the effect of three levels of salicylic acid (SA) as a growth regulator (0, 50, 100 ppm) and three rates of nitrogen fertilizer (75, 100, 125 kg N/feddan) on growth and yield of a spring wheat (Giza 168). The experimental design was a split plot with the main plots in randomized complete block design (RCBD) and four replicates. The results indicated that increasing nitrogen fertilizer rates resulted in insignificant effect on both plant height (cm) and grain weight/spike only. However, a significant effect was observed in all the other studied characters due to the increase in nitrogen fertilizer. On the other hand, increasing salicylic acid rates resulted in insignificant effect in all the studied characters except for chlorophyll a, chlorophyll b, number of grain/spike, and grain yield (gm/ plot). The highest effects on grain yield in wheat were obtained by the rate of 125 kg/feddan of nitrogen fertilizer and 100 ppm of salicylic acid. In conclusion, the data indicated that a high grain yield could be obtained by adding 100 kg/feddan of nitrogen fertilizer and spraying of 50 ppm of salicylic acid with no significant difference with the highest rates. Finally, the interaction had no significant effect on all the studied characters.

Keywords: growth regulator, nitrogen fertilizer, spring wheat, salicylic acid

Procedia PDF Downloads 108
24819 Machine Learning Analysis of Student Success in Introductory Calculus Based Physics I Course

Authors: Chandra Prayaga, Aaron Wade, Lakshmi Prayaga, Gopi Shankar Mallu

Abstract:

This paper presents the use of machine learning algorithms to predict the success of students in an introductory physics course. A dataset of 140 rows describing the performance of two batches of students was used. The lack of sufficient data to train robust machine learning models was compensated for by generating synthetic data similar to the real data. CTGAN and CTGAN with Gaussian Copula (Gaussian) were used to generate synthetic data, with the real data as input. To check the similarity between the real data and each synthetic dataset, pair plots were made. The synthetic data was used to train machine learning models using the PyCaret package. For the CTGAN data, the Ada Boost Classifier (ADA) was found to be the ML model with the best fit, whereas the CTGAN with Gaussian Copula yielded Logistic Regression (LR) as the best model. Both models were then tested for accuracy with the real data. ROC-AUC analysis was performed for all ten classes of the target variable (Grades A, A-, B+, B, B-, C+, C, C-, D, F). The ADA model with CTGAN data showed a mean AUC score of 0.4377, whereas the LR model with the Gaussian data showed a mean AUC score of 0.6149. ROC-AUC plots were obtained for each Grade value separately. The LR model with Gaussian data showed consistently better AUC scores than the ADA model with CTGAN data, except for two Grade values, C- and A-.
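As an illustration of how a mean AUC over several grade classes can be computed, the following is a minimal sketch of one-vs-rest ROC-AUC in plain Python. It is not the authors' PyCaret pipeline; the function names `binary_auc` and `mean_ovr_auc` and the input layout (one dictionary of class probabilities per student) are assumptions made for this example.

```python
from typing import Dict, List, Sequence


def binary_auc(is_positive: Sequence[bool], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win (the Mann-Whitney U formulation of ROC-AUC).
    """
    pos = [s for s, p in zip(scores, is_positive) if p]
    neg = [s for s, p in zip(scores, is_positive) if not p]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def mean_ovr_auc(labels: Sequence[str], probs: List[Dict[str, float]]) -> float:
    """Average the one-vs-rest AUC over every class that occurs in `labels`."""
    aucs = []
    for cls in sorted(set(labels)):
        is_pos = [y == cls for y in labels]   # current class against all others
        scores = [p[cls] for p in probs]      # predicted probability of that class
        aucs.append(binary_auc(is_pos, scores))
    return sum(aucs) / len(aucs)
```

A mean AUC near 0.5 corresponds to chance-level ranking, which is a useful reference point when reading the two scores reported in the abstract.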

Keywords: machine learning, student success, physics course, grades, synthetic data, CTGAN, gaussian copula CTGAN

Procedia PDF Downloads 32
24818 Data Access, AI Intensity, and Scale Advantages

Authors: Chuping Lo

Abstract:

This paper presents a simple model demonstrating that, ceteris paribus, countries with lower barriers to accessing global data tend to earn higher incomes than other countries. Therefore, large countries that inherently have greater data resources tend to have higher incomes than smaller countries, such that the former may be more hesitant than the latter to liberalize cross-border data flows in order to maintain this advantage. Furthermore, countries with higher artificial intelligence (AI) intensity in production technologies tend to benefit more from economies of scale in data aggregation, leading to higher income and more trade as they are better able to utilize global data.

Keywords: digital intensity, digital divide, international trade, scale of economics

Procedia PDF Downloads 54
24817 Secured Transmission and Reserving Space in Images Before Encryption to Embed Data

Authors: G. R. Navaneesh, E. Nagarajan, C. H. Rajam Raju

Abstract:

Nowadays, multimedia data are used to store secure information. Previous methods allocate space in the image for data embedding after encryption. In this paper, we propose a novel method that reserves space in the image, surrounded by a boundary, before encryption using a traditional reversible data hiding (RDH) algorithm, which makes it easy for the data hider to reversibly embed data in the encrypted images. The proposed method can achieve real time performance, that is, data extraction and image recovery are free of any error. A secure transmission process is also discussed in this paper, which improves efficiency tenfold compared with the other processes discussed.
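The idea of embedding data in reserved room so that both the payload and the cover image can be recovered without error can be sketched with plain least-significant-bit (LSB) substitution. This is a generic illustration, not the algorithm proposed in the paper: encryption is omitted, and the function names and the choice to keep the original LSBs in a separate list are assumptions for the example.

```python
from typing import List, Tuple


def embed_bits(pixels: List[int], bits: List[int]) -> Tuple[List[int], List[int]]:
    """Write `bits` into the LSBs of the first len(bits) pixels.

    Returns the marked pixels and the original LSBs, which a reversible
    scheme must preserve somewhere so the cover image can be restored.
    """
    if len(bits) > len(pixels):
        raise ValueError("payload is larger than the reserved room")
    saved = [p & 1 for p in pixels[:len(bits)]]
    marked = list(pixels)
    for i, b in enumerate(bits):
        marked[i] = (marked[i] & ~1) | (b & 1)  # clear the LSB, then set it to b
    return marked, saved


def extract_bits(marked: List[int], count: int) -> List[int]:
    """Read the payload back from the LSBs of the first `count` pixels."""
    return [p & 1 for p in marked[:count]]


def restore_pixels(marked: List[int], saved: List[int]) -> List[int]:
    """Put the saved LSBs back, recovering the original pixels exactly."""
    restored = list(marked)
    for i, b in enumerate(saved):
        restored[i] = (restored[i] & ~1) | b
    return restored
```

For example, embedding `[1, 0, 0, 1]` into pixels `[120, 37, 255, 0]` yields `[121, 36, 254, 1]`, and restoring with the saved LSBs returns the original four values.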

Keywords: secure communication, reserving room before encryption, least significant bits, image encryption, reversible data hiding

Procedia PDF Downloads 402
24816 Identity Verification Using k-NN Classifiers and Autistic Genetic Data

Authors: Fuad M. Alkoot

Abstract:

DNA data have been used in forensics for decades. However, current research looks at using DNA as a biometric identity verification modality, with the goal of improving the speed of identification. We use gene data that was initially collected for autism detection to determine whether, and how accurately, such data can be used for identification applications. Our main goal is to find out whether our data preprocessing technique yields data useful as a biometric identification tool. We experiment with the nearest neighbor classifier to identify subjects. Results show that the optimal classification rate is achieved when the test set is corrupted by normally distributed noise with zero mean and a standard deviation of 1. The classification rate remains close to optimal at higher noise standard deviations, up to 3. This shows that the data can be used for identity verification with high accuracy using a simple classifier such as the k-nearest neighbor (k-NN).
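The evaluation described above (identify subjects with a nearest neighbor classifier after corrupting the test set with zero-mean Gaussian noise) can be sketched as follows. This is a minimal illustration under assumed inputs: feature vectors are plain lists of floats, Euclidean distance is used, and the helper names are hypothetical rather than taken from the paper.

```python
import math
import random
from collections import Counter
from typing import List, Sequence


def knn_predict(train_x: List[Sequence[float]], train_y: List[str],
                query: Sequence[float], k: int = 1) -> str:
    """Return the majority label among the k training vectors closest to `query`."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def add_gaussian_noise(vectors: List[Sequence[float]], sigma: float,
                       seed: int = 0) -> List[List[float]]:
    """Corrupt every feature with zero-mean normal noise of standard deviation `sigma`."""
    rng = random.Random(seed)  # fixed seed so a run can be repeated
    return [[v + rng.gauss(0.0, sigma) for v in vec] for vec in vectors]


def classification_rate(train_x, train_y, test_x, test_y, k: int = 1) -> float:
    """Fraction of test vectors whose predicted subject matches the true subject."""
    hits = sum(knn_predict(train_x, train_y, q, k) == y
               for q, y in zip(test_x, test_y))
    return hits / len(test_y)
```

Calling `classification_rate(train_x, train_y, add_gaussian_noise(test_x, sigma), test_y)` for a range of `sigma` values would trace out the kind of noise sensitivity curve the abstract refers to.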

Keywords: biometrics, genetic data, identity verification, k nearest neighbor

Procedia PDF Downloads 241
24815 A Review on Intelligent Systems for Geoscience

Authors: R. Palson Kennedy, P. Kiran Sai

Abstract:

This article introduces machine learning (ML) researchers to the hurdles that geoscience problems present, as well as the opportunities for improvement in both ML and the geosciences. To meet that need, it presents a review from the data life cycle perspective. Numerous facets of the geosciences present unique difficulties for the study of intelligent systems. Geoscience data is notoriously difficult to analyze since it is frequently unpredictable, intermittent, sparse, multi-resolution, and multi-scale. The first half addresses data science’s essential concepts and theoretical underpinnings, while the second half covers key themes and shares experiences from current publications focused on each stage of the data life cycle. Finally, themes such as open science, smart data, and team science are considered.

Keywords: Data science, intelligent system, machine learning, big data, data life cycle, recent development, geo science

Procedia PDF Downloads 124
24814 An Alternative Proof for the Topological Entropy of the Motzkin Shift

Authors: Fahad Alsharari, Mohd Salmi Md. Noorani

Abstract:

A Motzkin shift is a mathematical model for constraints on genetic sequences. In terms of the theory of symbolic dynamics, the Motzkin shift is nonsofic, and therefore we cannot use the Perron-Frobenius theory to calculate its topological entropy. The Motzkin shift M(M,N), which comes from language theory, is defined to be the shift system over an alphabet A that consists of N negative symbols, N positive symbols and M neutral symbols. For an x in the full shift A^Z, x is in M(M,N) if and only if every finite block appearing in x has a non-zero reduced form. Therefore, the constraint for x cannot be bounded in length. K. Inoue has shown that the entropy of the Motzkin shift M(M,N) is log(M + N + 1). In this paper, we find a new method of calculating the topological entropy of the Motzkin shift M(M,N) without any measure-theoretic discussion.
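The admissibility condition (every finite block must have a non-zero reduced form) can be checked with a stack, much like bracket matching. The sketch below assumes the usual monoid relations, which the abstract does not spell out: a negative symbol l_i followed by the matching positive symbol r_i reduces to the identity, l_i followed by r_j with i != j reduces to zero, and neutral symbols act as the identity. The tuple encoding of symbols is also an assumption for this example.

```python
from typing import Iterable, List, Tuple

# A symbol is encoded as (kind, index): kind "l" = negative, "r" = positive,
# "n" = neutral. This encoding is hypothetical and used only for illustration.
Symbol = Tuple[str, int]


def reduces_to_zero(block: Iterable[Symbol]) -> bool:
    """Return True if the block's reduced form is zero (the block is forbidden)."""
    pending: List[int] = []  # indices of negative symbols not yet cancelled
    for kind, index in block:
        if kind == "n":
            continue                      # neutral symbols reduce to the identity
        if kind == "l":
            pending.append(index)         # wait for a matching positive symbol
        elif kind == "r":
            if pending and pending.pop() != index:
                return True               # l_i r_j with i != j is zero
            # with nothing pending, r stays in the reduced form, which is non-zero
        else:
            raise ValueError(f"unknown symbol kind: {kind!r}")
    return False


def is_admissible(block: Iterable[Symbol]) -> bool:
    """A finite block may appear in the shift only if its reduced form is non-zero."""
    return not reduces_to_zero(block)
```

Because a negative symbol may wait arbitrarily long for its partner, no finite window suffices to check the constraint, which is the unbounded-length property the abstract mentions.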

Keywords: entropy, Motzkin shift, mathematical model, theory

Procedia PDF Downloads 466
24813 Cooperative CDD Scheme Based on Adaptive Modulation in Wireless Communication System

Authors: Seung-Jun Yu, Hwan-Jun Choi, Hyoung-Kyu Song

Abstract:

Among spatial diversity schemes, orthogonal space-time block code (OSTBC) and cyclic delay diversity (CDD) have been widely studied for the cooperative wireless relaying system. However, conventional OSTBC and CDD cannot cope with a change in the number of relays owing to low throughput or poor error performance. In this paper, we propose a cooperative CDD scheme that uses hierarchical modulation at the source and adaptive modulation based on a cyclic redundancy check (CRC) code at the relays.
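For readers unfamiliar with CDD, its core operation is that each transmitter sends a cyclically shifted copy of the same time-domain symbol. The following sketch shows only that shift; the hierarchical and CRC-based adaptive modulation proposed in the paper are not modeled, and the per-relay delay step is an assumed parameter.

```python
from typing import List, Sequence


def cyclic_delay(symbol: Sequence[complex], delay: int) -> List[complex]:
    """Cyclically delay a block of samples: out[k] = symbol[(k - delay) mod n]."""
    n = len(symbol)
    if n == 0:
        return []
    d = delay % n
    if d == 0:
        return list(symbol)
    return list(symbol[-d:]) + list(symbol[:-d])


def relay_signals(symbol: Sequence[complex], num_relays: int,
                  delay_step: int) -> List[List[complex]]:
    """Give relay r the symbol delayed by r * delay_step samples (assumed spacing)."""
    return [cyclic_delay(symbol, r * delay_step) for r in range(num_relays)]
```

Adding or removing a relay in this sketch only changes how many shifted copies are produced.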

Keywords: adaptive modulation, cooperative communication, CDD, OSTBC

Procedia PDF Downloads 417
24812 Linear Codes Afforded by the Permutation Representations of Finite Simple Groups and Their Support Designs

Authors: Amin Saeidi

Abstract:

Using a representation-theoretic approach and considering G to be a finite primitive permutation group of degree n, our aim is to determine linear codes of length n that admit G as a permutation automorphism group. We can show that in some cases, every binary linear code admitting G as a permutation automorphism group is a submodule of a permutation module defined by a primitive action of G. As an illustration of the method, we consider the sporadic simple group M₁₁ and the unitary group U(3,3). We also construct some point- and block-primitive 1-designs from the supports of some codewords of the codes in the discussion.

Keywords: linear code, permutation representation, support design, simple group

Procedia PDF Downloads 68
24811 Adaptive Multiple Transforms Hardware Architecture for Versatile Video Coding

Authors: T. Damak, S. Houidi, M. A. Ben Ayed, N. Masmoudi

Abstract:

The Versatile Video Coding (VVC) standard is currently under development by the Joint Video Exploration Team (JVET). An Adaptive Multiple Transforms (AMT) approach has been announced. It is based on different transform modules that provide efficient coding. However, the AMT solution raises several issues, especially regarding the complexity of the selected set of transforms. This can be an important issue, particularly for future industrial adoption. This paper proposes an efficient hardware implementation of the most used transform in the AMT approach: the DCT II. The developed circuit adapts to different block sizes and can reach a minimum frequency of 192 MHz, allowing an optimized execution time.
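As a reference for what the DCT II computes, here is a floating-point sketch of the orthonormal one-dimensional transform for an arbitrary block size. It is a direct evaluation of the textbook definition, not the hardware architecture of the paper; video codecs and circuits typically use scaled integer approximations and fast factorizations instead.

```python
import math
from typing import List, Sequence


def dct2(x: Sequence[float]) -> List[float]:
    """Orthonormal DCT-II: X[k] = c(k) * sum_i x[i] * cos(pi * (i + 0.5) * k / n)."""
    n = len(x)
    out = []
    for k in range(n):
        acc = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        # c(0) = sqrt(1/n) and c(k) = sqrt(2/n) otherwise make the transform orthonormal
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * acc)
    return out
```

Applied to a constant block such as `[1, 1, 1, 1]`, all of the energy lands in the first (DC) coefficient, which is the compaction property that makes the transform useful for coding smooth image regions.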

Keywords: adaptive multiple transforms, AMT, DCT II, hardware, transform, versatile video coding, VVC

Procedia PDF Downloads 134
24810 Data Quality as a Pillar of Data-Driven Organizations: Exploring the Benefits of Data Mesh

Authors: Marc Bachelet, Abhijit Kumar Chatterjee, José Manuel Avila

Abstract:

Data quality is a key component of any data-driven organization. Without data quality, organizations cannot effectively make data-driven decisions, which often leads to poor business performance. Therefore, it is important for an organization to ensure that the data they use is of high quality. This is where the concept of data mesh comes in. Data mesh is an organizational and architectural decentralized approach to data management that can help organizations improve the quality of data. The concept of data mesh was first introduced in 2020. Its purpose is to decentralize data ownership, making it easier for domain experts to manage the data. This can help organizations improve data quality by reducing the reliance on centralized data teams and allowing domain experts to take charge of their data. This paper intends to discuss how a set of elements, including data mesh, are tools capable of increasing data quality. One of the key benefits of data mesh is improved metadata management. In a traditional data architecture, metadata management is typically centralized, which can lead to data silos and poor data quality. With data mesh, metadata is managed in a decentralized manner, ensuring accurate and up-to-date metadata, thereby improving data quality. Another benefit of data mesh is the clarification of roles and responsibilities. In a traditional data architecture, data teams are responsible for managing all aspects of data, which can lead to confusion and ambiguity in responsibilities. With data mesh, domain experts are responsible for managing their own data, which can help provide clarity in roles and responsibilities and improve data quality. Additionally, data mesh can also contribute to a new form of organization that is more agile and adaptable. 
By decentralizing data ownership, organizations can respond more quickly to changes in their business environment, which in turn can help improve overall performance by providing better business insights through better reports and visualization tools. Monitoring and analytics are also important aspects of data quality. With data mesh, monitoring and analytics are decentralized, allowing domain experts to monitor and analyze their own data. This helps in identifying and addressing data quality problems quickly, leading to improved data quality. Data culture is another major aspect of data quality. With data mesh, domain experts are encouraged to take ownership of their data, which can help create a data-driven culture within the organization. This can lead to improved data quality and better business outcomes. Finally, the paper explores the contribution of AI in the coming years. AI can help enhance data quality by automating many data-related tasks, such as data cleaning and data validation. By integrating AI into data mesh, organizations can further enhance the quality of their data. The concepts mentioned above are illustrated by feedback from AEKIDEN's experience. AEKIDEN is an international data-driven consultancy that has successfully implemented a data mesh approach. By sharing its experience, AEKIDEN can help other organizations understand the benefits and challenges of implementing data mesh and improving data quality.

Keywords: data culture, data-driven organization, data mesh, data quality for business success

Procedia PDF Downloads 121
24809 Big Data Analysis with RHadoop

Authors: Ji Eun Shin, Byung Ho Jung, Dong Hoon Lim

Abstract:

It is almost impossible to store or analyze exponentially growing big data with traditional technologies. Hadoop is a new technology that makes this possible. The R programming language is by far the most popular statistical tool for big data analysis based on distributed processing with Hadoop technology. With RHadoop, which integrates the R and Hadoop environments, we implemented parallel multiple regression analysis on actual data of different sizes. Experimental results showed that our RHadoop system became much faster as the number of data nodes increased. We also compared the performance of our RHadoop system with the lm function and the biglm package available on bigmemory. The results showed that RHadoop was faster than the other packages owing to parallel processing, with the number of map tasks increasing as the size of the data increases.
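One common way to parallelize multiple regression in a map-reduce setting is to have each map task compute the partial sums X^T X and X^T y for its chunk, add them up in the reduce step, and solve the normal equations once. The sketch below shows that pattern in plain Python; it is an assumption about the general technique, not the authors' RHadoop code, and all function names are hypothetical.

```python
from typing import Iterable, List, Sequence, Tuple

Row = Tuple[Sequence[float], float]          # (features, response)
Partial = Tuple[List[List[float]], List[float]]


def map_chunk(rows: Sequence[Row]) -> Partial:
    """Map step: X^T X and X^T y for one chunk, with an intercept column added."""
    p = len(rows[0][0]) + 1
    xtx = [[0.0] * p for _ in range(p)]
    xty = [0.0] * p
    for features, y in rows:
        x = [1.0] + list(features)
        for i in range(p):
            xty[i] += x[i] * y
            for j in range(p):
                xtx[i][j] += x[i] * x[j]
    return xtx, xty


def reduce_partials(partials: Iterable[Partial]) -> Partial:
    """Reduce step: element-wise sum of the per-chunk matrices and vectors."""
    parts = list(partials)
    p = len(parts[0][1])
    xtx = [[sum(a[i][j] for a, _ in parts) for j in range(p)] for i in range(p)]
    xty = [sum(b[i] for _, b in parts) for i in range(p)]
    return xtx, xty


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("normal equations are singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit(chunks: Iterable[Sequence[Row]]) -> List[float]:
    """Coefficients [intercept, b1, b2, ...] fitted over all chunks."""
    xtx, xty = reduce_partials(map_chunk(c) for c in chunks)
    return solve(xtx, xty)
```

Each chunk is reduced to a small (p+1) by (p+1) summary, so the data passed to the reduce step does not grow with the number of rows, which is why adding map tasks can pay off as the data grows.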

Keywords: big data, Hadoop, parallel regression analysis, R, RHadoop

Procedia PDF Downloads 423