Search results for: data portability
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 24195

24135 The Incidence of Prostate Cancer in Previous Infected E. Coli Population

Authors: Andreea Molnar, Amalia Ardeljan, Lexi Frankel, Marissa Dallara, Brittany Nagel, Omar Rashid

Abstract:

Background: Escherichia coli is a gram-negative, facultative anaerobic bacterium that belongs to the family Enterobacteriaceae and resides in the intestinal tracts of individuals. E. coli has numerous strains grouped into serogroups and serotypes based on differences in antigens in their cell walls (somatic, or “O” antigens) and flagella (“H” antigens). More than 700 serotypes of E. coli have been identified. Although most strains of E. coli are harmless, a few strains, such as E. coli O157:H7, which produces Shiga toxin, can cause intestinal infection with symptoms of severe abdominal cramps, bloody diarrhea, and vomiting. Infection with E. coli can lead to the development of systemic inflammation as the toxin exerts its effects. Chronic inflammation is now known to contribute to cancer development in several organs, including the prostate. The purpose of this study was to evaluate the correlation between E. coli infection and the incidence of prostate cancer. Methods: Data for this cohort study were provided by a Health Insurance Portability and Accountability Act (HIPAA) compliant national database and were used to evaluate patients with E. coli infection and prostate cancer using International Classification of Disease (ICD-10 and ICD-9) codes. Permission to use the database was granted by Holy Cross Health, Fort Lauderdale, for the purpose of academic research. Data analysis was conducted using standard statistical methods. Results: The query covered January 2010 to December 2019 and, after matching, yielded 81,037 patients in each of the infected and control groups. The two groups were matched by age range and CCI score. The incidence of prostate cancer was 2.07% (1,680 patients) in the E. coli group compared to 5.19% (4,206 patients) in the control group. The difference was statistically significant (p < 2.2×10⁻¹⁶), with an odds ratio of 0.53 and a 95% CI.
Based on the specific treatment for E. coli, the infected and control groups were matched again, resulting in 31,696 patients in each group. Among patients with a prior E. coli infection who were treated with antibiotics, 827 of 31,696 (2.60%) subsequently developed prostate carcinoma, compared to 1,634 of 31,696 (5.15%) patients with no history of E. coli infection (control) who received antibiotic treatment. Results remained statistically significant (p < 2.2×10⁻¹⁶), with an odds ratio of 0.55 (95% CI 0.51-0.59). Conclusion: This retrospective study shows a statistically significant correlation between E. coli infection and a decreased incidence of prostate cancer. Further evaluation is needed to identify the impact of E. coli infection on prostate cancer development.

Keywords: E. Coli, prostate cancer, protective, microbiology

24134 Control the Flow of Big Data

Authors: Shizra Waris, Saleem Akhtar

Abstract:

Big data is a research area receiving attention from academia and IT communities. In the digital world, the amounts of data produced and stored have expanded within a short period of time. Consequently, this fast-increasing rate of data has created many challenges. In this paper, we use functionalism and structuralism paradigms to analyze the genesis of big data applications and their current trends. This paper presents a complete discussion of state-of-the-art big data technologies based on group and stream data processing. Moreover, the strengths and weaknesses of these technologies are analyzed. This study also covers big data analytics techniques, processing methods, some reported case studies from different vendors, several open research challenges, and the opportunities brought about by big data. The similarities and differences of these techniques and technologies based on important limitations are also investigated. Emerging technologies are suggested as a solution for big data problems.

Keywords: computer, IT community, industry, big data

24133 High Performance Computing and Big Data Analytics

Authors: Branci Sarra, Branci Saadia

Abstract:

Because of the rapid growth of data, many computer science tools have been developed to process and analyze Big Data. High-performance computing architectures have been designed to meet the processing needs of Big Data (from the standpoint of transaction processing and of strategic and tactical analytics). The purpose of this article is to provide a historical and global perspective on the recent trend of high-performance computing architectures, especially as it relates to analytics and data mining.

Keywords: high performance computing, HPC, big data, data analysis

24132 A Landscape of Research Data Repositories in Re3data.org Registry: A Case Study of Indian Repositories

Authors: Prashant Shrivastava

Abstract:

The purpose of this study is to explore the re3data.org registry to identify the registration workflow process for research data repositories. A further objective is to depict the present development of research data repositories in India. The study first seeks to understand the re3data.org registry framework and schema design, and then proceeds to explore the status of Indian research data repositories in the re3data.org registry. Research data repositories are gaining wider relevance due to e-research concepts. The re3data.org registry now available is a good tool for users and researchers to identify appropriate research data repositories for their research requirements. In the Indian environment, a compatible National Research Data Policy is needed to boost the management of research data. A registry for research data repositories is a crucial tool for discovering specific information in a specific domain. Moreover, research data repositories in India have not previously been studied. Both the re3data.org registry and the status of Indian research data repositories are discussed in this study.

Keywords: research data, research data repositories, research data registry, re3data.org

24131 A Study of Cloud Computing Solution for Transportation Big Data Processing

Authors: Ilgin Gökaşar, Saman Ghaffarian

Abstract:

The need for fast processing of big data on transportation ridership (e.g., smartcard data) and traffic operation (e.g., traffic detector data), which requires a lot of computational power, is incontrovertible in Intelligent Transportation Systems. Nowadays, cloud computing is an important subject and a popular information technology solution for data processing. It enables users to process enormous amounts of data without having their own computing power. Thus, it can also be a good choice for transportation big data processing. This paper examines how cloud computing can enhance transportation big data processing by contrasting its advantages and disadvantages and discussing cloud computing features.

Keywords: big data, cloud computing, Intelligent Transportation Systems, ITS, traffic data processing

24130 Harmonic Data Preparation for Clustering and Classification

Authors: Ali Asheibi

Abstract:

The rapid increase in the size of databases required to store power quality monitoring data has demanded new techniques for analysing and understanding the data. One suggested technique to assist in analysis is data mining. Preparing raw data for data mining exploration takes up most of the effort and time spent in the whole data mining process. Clustering is an important technique in data mining and machine learning in which underlying and meaningful groups of data are discovered. Large amounts of harmonic data have been collected from an actual harmonic monitoring system in a distribution system in Australia over three years. This amount of acquired data makes it difficult to identify operational events that significantly impact the harmonics generated on the system. In this paper, harmonic data preparation processes for better understanding of the data are presented. Underlying classes in this data have then been identified using a clustering technique based on the Minimum Message Length (MML) method. The underlying operational information contained within the clusters can be rapidly visualised by engineers. The C5.0 algorithm was used for classification and interpretation of the generated clusters.

Keywords: data mining, harmonic data, clustering, classification

24129 Linguistic Summarization of Structured Patent Data

Authors: E. Y. Igde, S. Aydogan, F. E. Boran, D. Akay

Abstract:

Patent data have an increasingly important role in economic growth, innovation, technical advantages, business strategies, and even competition between countries. Analyzing patent data is crucial since patents cover a large part of all the technological information in the world. In this paper, we have used the linguistic summarization technique to prove the validity of the hypotheses related to patent data stated in the literature.

Keywords: data mining, fuzzy sets, linguistic summarization, patent data

24128 Proposal of Data Collection from Probes

Authors: M. Kebisek, L. Spendla, M. Kopcek, T. Skulavik

Abstract:

In our paper, we describe the security capabilities of data collection. Data are collected with probes located in the near and distant surroundings of the company. Considering the numerous obstacles (e.g., forests, hills, urban areas), data collection is realized in several ways. Data are collected via wireless communication, LAN, and GSM networks, and in certain areas by using vehicles. To ensure the connection to the server, most of the probes are able to communicate in several ways. Collected data are archived and subsequently used in supervisory applications. To ensure the collection of the required data, it is necessary to propose algorithms that allow the probes to select a suitable communication channel.

Keywords: communication, computer network, data collection, probe

24127 A Review on Big Data Movement with Different Approaches

Authors: Nay Myo Sandar

Abstract:

With the growth of technologies and applications, a large amount of data is being produced at an increasing rate from various sources such as social media networks, sensor devices, and other information-serving devices. This massive, complex, and exponentially growing collection of datasets is called big data. Traditional database systems cannot store and process such data due to its size and complexity. Consequently, cloud computing is a potential solution for data storage and processing since it can provide a pool of resources for servers and storage. However, moving large amounts of data to and from the cloud is challenging since it can encounter high latency due to the large data size. With respect to the big data movement problem, this paper reviews the literature, discusses research issues, and identifies approaches for dealing with the problem.

Keywords: Big Data, Cloud Computing, Big Data Movement, Network Techniques

24126 Laser Based Microfabrication of a Microheater Chip for Cell Culture

Authors: Daniel Nieto, Ramiro Couceiro

Abstract:

Microfluidic chips have demonstrated significant application potential in microbiological processing and chemical reactions, with the goal of developing monolithic and compact chip-sized multifunctional systems. Heat generation and thermal control are critical in some biochemical processes. The paper presents a laser direct-write technique for rapid prototyping and manufacturing of microheater chips and its applicability to perfusion cell culture outside a cell incubator. The aim of the microheater is to take the role of conventional incubators for cell culture, facilitating microscopic observation or other online monitoring activities during cell culture and providing portability of cell culture operation. Microheaters (5 mm × 5 mm) have been successfully fabricated on soda-lime glass substrates covered with an aluminum layer 120 nm thick. Experimental results show that the microheaters exhibit good performance in temperature rise and decay characteristics, with localized heating at targeted spatial domains. These microheaters were suitable for a maximum long-term operation temperature of 120ºC and were validated for long-term operation at 37ºC for 24 hours. Results demonstrated that the physiology of the cultured SW480 colon adenocarcinoma cell line on the developed microheater chip was consistent with that in an incubator.

Keywords: laser microfabrication, microheater, bioengineering, cell culture

24125 Optimized Approach for Secure Data Sharing in Distributed Database

Authors: Ahmed Mateen, Zhu Qingsheng, Ahmad Bilal

Abstract:

In the current age of technology, information is the most precious asset of a company. Today, companies have large amounts of data. As the data grow larger, access to particular information becomes slower day by day. Processing data faster to shape it into information is the biggest issue. The major problems in distributed databases are the efficiency and the response time of data distribution. The security of data distribution is also a big issue. For these problems, we propose a strategy that can maximize the efficiency of data distribution and also improve its response time. This technique gives better results for secure data distribution from multiple heterogeneous sources. The newly proposed technique helps companies share data securely, efficiently, and quickly.

Keywords: ER-schema, electronic record, P2P framework, API, query formulation

24124 Thermoelectric Cooler As A Heat Transfer Device For Thermal Conductivity Test

Authors: Abdul Murad Zainal Abidin, Azahar Mohd, Nor Idayu Arifin, Siti Nor Azila Khalid, Mohd Julzaha Zahari Mohamad Yusof

Abstract:

A thermoelectric cooler (TEC) is an electronic component that uses the Peltier effect to create a temperature difference by transferring heat between two electrical junctions of two different types of materials. A TEC can also be used for heating, by reversing the electric current flow, and even for power generation. A heat flow meter (HFM) is an instrument for measuring the thermal conductivity of building materials. During the test, water is used as the heat transfer medium to cool the HFM. Existing re-circulating coolers on the market are very costly, and the alternative is to use piped tap water to extract heat from the HFM. However, the tap water temperature is not low enough for heat transfer to take place. The operating temperature for the isothermal plates in the HFM is 40°C with a range of ±0.02°C. When the temperature exceeds the operating range, the HFM stops working, and the test cannot be conducted. The aim of the research is to develop a low-cost but energy-efficient TEC prototype that enables heat transfer without compromising the function of the HFM. The objectives of the research are a) to identify the potential of the TEC as a cooling device by evaluating its cooling rate and b) to determine the amount of water saved using the TEC compared to normal tap water. Four (4) Peltier sets were used, with two (2) sets used as a pre-cooler. The cooling water is re-circulated from the reservoir into the HFM using a water pump. The thermal conductivity readings, the water flow rate, and the power consumption were measured while the HFM was operating. The measured data show a decrease in average cooling temperature difference (ΔTave) of 2.42°C and an average cooling rate of 0.031°C/min. The water savings accrued from using the TEC are projected to be 8,332.8 litres/year with the application of water re-circulation. The results suggest the prototype has achieved the required objectives.
Further research will include comparing the cooling rate of the TEC prototype against conventional tap water and optimizing its design and performance in terms of size and portability. The possible application of the prototype could also be expanded to portable storage for medicine and beverages.

Keywords: energy efficiency, thermoelectric cooling, pre-cooling device, heat flow meter, sustainable technology, thermal conductivity

24123 Data Mining Algorithms Analysis: Case Study of Price Predictions of Lands

Authors: Julio Albuja, David Zaldumbide

Abstract:

Data analysis is an important step before making a decision about money. The aim of this work is to analyze the factors that influence the final price of houses through data mining algorithms. To the best of our knowledge, previous work was reviewed only to compare results. Before the data in the data set were used, the Z-transformation was applied to standardize the data to the same range. The data were then classified into two groups to visualize them in a readable format. A decision tree was built, and the data are displayed graphically so that the results and the influence of the factors are easy to see. The definitions of these methods are described, as well as the descriptions of the results. Finally, conclusions and recommendations related to the results of our research are presented, making it easier to apply these algorithms using a customized data set.
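
The standardization step mentioned in this abstract can be illustrated with a minimal sketch. It assumes the Z-transformation means the usual z-score (subtract the mean, divide by the standard deviation); the function name `z_transform` and the sample prices and areas are hypothetical and are not taken from the paper.

```python
from statistics import mean, pstdev


def z_transform(values):
    """Standardize values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values are identical, so every z-score is defined here as 0.0.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical features on very different scales (illustrative only).
prices = [120000.0, 150000.0, 180000.0]
areas = [200.0, 250.0, 300.0]

z_prices = z_transform(prices)
z_areas = z_transform(areas)
```

After the transformation both features are on the same scale, which is what allows them to be compared or fed to a decision tree without one feature dominating because of its units.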

Keywords: algorithms, data, decision tree, transformation

24122 Application of Blockchain Technology in Geological Field

Authors: Mengdi Zhang, Zhenji Gao, Ning Kang, Rongmei Liu

Abstract:

Management and application of geological big data is an important part of China's national big data strategy. With the implementation of a national big data strategy, geological big data management becomes more and more critical. At present, there are still many technological barriers as well as conceptual confusion in many aspects of geological big data management and application, such as data sharing, intellectual property protection, and application technology. Therefore, it is a key task to make better use of new technologies for deeper exploration and wider application of geological big data. In this paper, we first briefly introduce the basic principle of blockchain technology and then analyze the application dilemma of geological data. Based on this analysis, we propose some feasible patterns and scenarios for blockchain application in geological big data and put forward several suggestions for future work in geological big data management.

Keywords: blockchain, intellectual property protection, geological data, big data management

24121 Incidence of Lymphoma and Gonorrhea Infection: A Retrospective Study

Authors: Diya Kohli, Amalia Ardeljan, Lexi Frankel, Jose Garcia, Lokesh Manjani, Omar Rashid

Abstract:

Gonorrhea is the second most common sexually transmitted disease (STD) in the United States of America. Gonorrhea affects the urethra, rectum, or throat, and the cervix in females. Lymphoma is a cancer of the immune network called the lymphatic system, which includes the lymph nodes/glands, spleen, thymus gland, and bone marrow. Lymphoma can affect many organs in the body. When a lymphocyte develops a genetic mutation, it signals other cells into rapid proliferation, which produces many mutated lymphocytes. Multiple studies have explored the incidence of cancer in people infected with STDs such as Gonorrhea. For instance, the studies conducted by Wang Y-C et al. and by Caini S et al. established a direct correlation between Gonorrhea infection and the incidence of prostate cancer. We hypothesize that Gonorrhea infection also increases the incidence of Lymphoma in patients. This research study aimed to evaluate the correlation between Gonorrhea infection and the incidence of Lymphoma. The data for the research were provided by a Health Insurance Portability and Accountability Act (HIPAA) compliant national database. This database was used to compare patients infected with Gonorrhea with those who were not infected, in order to establish a correlation with the prevalence of Lymphoma using ICD-10 and ICD-9 codes. Access to the database was granted by Holy Cross Health, Fort Lauderdale, for academic research. Standard statistical methods were applied throughout. The query covered January 2010 to December 2019 and yielded 254 and 808 patients with Lymphoma in the infected and control groups, respectively. The two groups were matched by age range and CCI score. The incidence of Lymphoma was 0.998% (254 patients out of 25,455) in the Gonorrhea group (patients infected with Gonorrhea who were Lymphoma positive) compared to 3.174% (808 patients) in the control group (patients negative for Gonorrhea but with Lymphoma).
This was statistically significant (p < 2.2×10⁻¹⁶), with an OR of 0.431 (95% CI 0.381-0.487). The patients were then matched by antibiotic treatment to avoid treatment bias. The incidence of Lymphoma was 1.215% (82 patients out of 6,748) in the Gonorrhea group compared to 2.949% (199 patients out of 6,748) in the control group. This was statistically significant (p < 5.4×10⁻¹⁰), with an OR of 0.468 (95% CI 0.367-0.596). The study shows a statistically significant correlation between Gonorrhea and a reduced incidence of Lymphoma. Further evaluation is recommended to assess the potential of Gonorrhea in reducing Lymphoma.

Keywords: gonorrhea, lymphoma, STDs, cancer, ICD

24120 Frequent Item Set Mining for Big Data Using MapReduce Framework

Authors: Tamanna Jethava, Rahul Joshi

Abstract:

Frequent item sets play an essential role in many data mining tasks that try to find interesting patterns in a database. Typically, a frequent item set is a set of items that frequently appear together in a transaction dataset. Several algorithms are used for frequent item set mining, yet most do not scale to the type of data we are presented with today, so-called “Big Data”. Big Data is a collection of large data sets. Our approach is to perform frequent item set mining over large datasets in a scalable and speedy way. MapReduce, along with HDFS, is used to find frequent item sets from Big Data on a large cluster. This paper focuses on using pre-processing and a mining algorithm as a hybrid approach for big data on the Hadoop platform.
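
The idea of counting item sets with a map step and a reduce step can be sketched in a single process as follows. This is only an illustration of the general MapReduce counting pattern, not the authors' hybrid algorithm and not Hadoop code; the function names, the `max_size` cap, and the sample transactions are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def map_phase(transaction, max_size=2):
    """Emit (itemset, 1) for every itemset of up to max_size items in one transaction."""
    items = sorted(set(transaction))
    for size in range(1, max_size + 1):
        for itemset in combinations(items, size):
            yield itemset, 1


def reduce_phase(pairs):
    """Sum the counts emitted for each itemset key."""
    counts = Counter()
    for itemset, count in pairs:
        counts[itemset] += count
    return counts


def frequent_itemsets(transactions, min_support, max_size=2):
    """Return the itemsets whose support count is at least min_support."""
    pairs = (pair for t in transactions for pair in map_phase(t, max_size))
    counts = reduce_phase(pairs)
    return {itemset: n for itemset, n in counts.items() if n >= min_support}


# Hypothetical transaction data (illustrative only).
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
]
result = frequent_itemsets(transactions, min_support=3)
```

On a real cluster the map calls would run in parallel over partitions of the transaction file and the framework would group the emitted keys before the reduce step; enumerating all candidate itemsets per transaction, as done here, is only practical for small `max_size`.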

Keywords: frequent item set mining, big data, Hadoop, MapReduce

24119 The Role Of Data Gathering In NGOs

Authors: Hussaini Garba Mohammed

Abstract:

Background/Significance: The lack of data gathering is preventing NGOs worldwide from having good data about educational and health-related issues among communities in any country and around the world. For example, HIV/AIDS, smoking (tuberculosis), and COVID-19 carriers are becoming a serious public health problem, especially among old men and women. But in some countries there is no fully detailed data survey assessment from communities, villages, and rural areas to show the percentage of victims and patients, especially for the worldwide COVID-19 virus. These data are essential to inform programming targets, strategies, and priorities for gathering good information in any society.

Keywords: reliable information, data assessment, data mining, data communication

24118 Design, Analysis and Construction of a 250vac 8amps Arc Welding Machine

Authors: Anthony Okechukwu Ifediniru, Austin Ikechukwu Gbasouzor, Isidore Uche Uju

Abstract:

This article is centered on the design, analysis, construction, and testing of a locally made arc welding machine that operates on 250 VAC with 8 amp output taps ranging from 60 VAC to 250 VAC at a fixed frequency, which is of benefit to urban areas, while considering its cost-effectiveness, strength, portability, and mobility. The welding machine uses a power supply to create an electric arc between an electrode and the metal at the welding point. A current selector coil needed for current selection is connected to the primary winding. Electric power is supplied to the primary winding of its transformer and is transferred to the secondary winding by induction. The voltage and current output of the secondary winding are connected to the output terminal, which is used to carry out welding work. The output current of the machine ranges from 110 amps for low-current welding to 250 amps for high-current welding. The machine uses a step-down transformer configuration to step down the voltage in order to obtain a high current level for effective welding. The welder can adjust the output current within a certain range, which allows the output current to be set properly for the type of welding being performed. The constructed arc welding machine was tested by connecting the work piece to it. Since there was no shock or spark from the transformer’s laminated core and the machine was successfully used to join metals, the design was confirmed and validated.

Keywords: AC current, arc welding machine, DC current, transformer, welds

24117 The Application of Data Mining Technology in Building Energy Consumption Data Analysis

Authors: Liang Zhao, Jili Zhang, Chongquan Zhong

Abstract:

Energy consumption data, in particular those involving public buildings, are impacted by many factors: the building structure, climate/environmental parameters, construction, system operating condition, and user behavior patterns. Traditional methods for data analysis are insufficient. This paper examines data mining technology to determine its application in the analysis of building energy consumption data, including energy consumption prediction, fault diagnosis, and optimal operation. Recent literature is reviewed and summarized, the problems faced by data mining technology in the area of energy consumption data analysis are enumerated, and research points for future studies are given.

Keywords: data mining, data analysis, prediction, optimization, building operational performance

24116 Low Power CMOS Amplifier Design for Wearable Electrocardiogram Sensor

Authors: Ow Tze Weng, Suhaila Isaak, Yusmeeraz Yusof

Abstract:

The trend in health care screening devices worldwide is increasingly towards portability and wearability, especially in the most common electrocardiogram (ECG) monitoring systems, because wearable screening devices do not restrict the patient’s freedom and daily activities. While the demand for low-power and low-cost biomedical systems on chip (SoC) is increasing exponentially, front-end ECG sensors still suffer from flicker noise in low-frequency cardiac signal acquisition, 50 Hz power line electromagnetic interference, and large unstable input offsets when the electrode-skin interface is not attached properly. In this paper, a high-performance CMOS amplifier for ECG sensors that is suitable for low-power wearable cardiac screening is proposed. The amplifier adopts the highly stable folded cascode topology and is then implemented in an RC feedback circuit for low-frequency DC offset cancellation. Using 0.13 µm CMOS technology from Silterra, the simulation results show that this front-end circuit can achieve a very low input-referred noise of 1 pV/√Hz and a high common mode rejection ratio (CMRR) of 174.05 dB. It also gives a voltage gain of 75.45 dB with a good power supply rejection ratio (PSRR) of 92.12 dB. The total power consumption is only 3 µW, making it suitable for implementation with a further signal processing and classification back end for low-power biomedical SoCs.
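
For readers less used to decibel figures, the quoted gain can be converted to a plain voltage ratio. The sketch below assumes the standard voltage convention, dB = 20·log10(ratio), which is also how CMRR and PSRR are conventionally expressed; the helper names are hypothetical and nothing here comes from the paper's simulation setup.

```python
import math


def db_to_voltage_ratio(db):
    """Convert a voltage gain in decibels to a linear ratio (20*log10 convention)."""
    return 10 ** (db / 20)


def voltage_ratio_to_db(ratio):
    """Convert a linear voltage ratio back to decibels."""
    return 20 * math.log10(ratio)


# Voltage gain quoted in the abstract, expressed as a linear ratio.
gain_ratio = db_to_voltage_ratio(75.45)
```

Under this convention a 75.45 dB gain corresponds to an output amplitude several thousand times the input amplitude, which is the scale needed to lift millivolt-level ECG signals to levels a back end can process.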

Keywords: CMOS, ECG, amplifier, low power

24115 To Handle Data-Driven Software Development Projects Effectively

Authors: Shahnewaz Khan

Abstract:

Machine learning (ML) techniques are often used in projects for creating data-driven applications. These tasks typically demand additional research and analysis. The proper technique and strategy must be chosen to ensure the success of data-driven projects. Otherwise, even with a lot of effort, the necessary development might not always be possible. In this post, we examine the workflow of data-driven software development projects and its implementation process in order to describe how to manage a project successfully, which will help minimize the added workload.

Keywords: data, data-driven projects, data science, NLP, software project

24114 The Relationship Between Artificial Intelligence, Data Science, and Privacy

Authors: M. Naidoo

Abstract:

Artificial intelligence often requires large amounts of good quality data. Within important fields, such as healthcare, the training of AI systems predominately relies on health and personal data; however, the usage of this data is complicated by various layers of law and ethics that seek to protect individuals’ privacy rights. This research seeks to establish the challenges AI and data sciences pose to (i) informational rights, (ii) privacy rights, and (iii) data protection. To solve some of the issues presented, various methods are suggested, such as embedding values in technological development, proper balancing of rights and interests, and others.

Keywords: artificial intelligence, data science, law, policy

24113 Simulation Data Summarization Based on Spatial Histograms

Authors: Jing Zhao, Yoshiharu Ishikawa, Chuan Xiao, Kento Sugiura

Abstract:

In order to analyze large-scale scientific data, research on data exploration and visualization has gained popularity. In this paper, we focus on the exploration and visualization of scientific simulation data, and define a spatial V-Optimal histogram for data summarization. We propose histogram construction algorithms based on a general binary hierarchical partitioning as well as a more specific one, the l-grid partitioning. For effective data summarization and efficient data visualization in scientific data analysis, we propose an optimal algorithm as well as a heuristic algorithm for histogram construction. To verify the effectiveness and efficiency of the proposed methods, we conduct experiments on the massive evacuation simulation data.
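
To make the V-Optimal criterion concrete, the sketch below builds a one-dimensional V-Optimal histogram with the classic dynamic program: it splits a sequence into a fixed number of contiguous buckets so that the total within-bucket squared error is minimal. This is a deliberate simplification. The paper works with spatial data and hierarchical or grid partitionings, and its algorithms are not reproduced here; the function name and sample values are hypothetical.

```python
def v_optimal_histogram(values, num_buckets):
    """Split values into num_buckets contiguous buckets minimizing total squared error.

    Returns (boundaries, cost), where boundaries is a list of (start, end)
    index pairs with end exclusive.
    """
    n = len(values)
    if not 1 <= num_buckets <= n:
        raise ValueError("num_buckets must be between 1 and len(values)")

    # Prefix sums let the squared error of any bucket be computed in O(1).
    prefix = [0.0] * (n + 1)
    prefix_sq = [0.0] * (n + 1)
    for k, v in enumerate(values):
        prefix[k + 1] = prefix[k] + v
        prefix_sq[k + 1] = prefix_sq[k] + v * v

    def sse(i, j):
        """Sum of squared deviations from the mean for values[i:j]."""
        s = prefix[j] - prefix[i]
        sq = prefix_sq[j] - prefix_sq[i]
        return sq - s * s / (j - i)

    inf = float("inf")
    # cost[b][j]: best error for the first j values using exactly b buckets.
    cost = [[inf] * (n + 1) for _ in range(num_buckets + 1)]
    split = [[0] * (n + 1) for _ in range(num_buckets + 1)]
    cost[0][0] = 0.0
    for b in range(1, num_buckets + 1):
        for j in range(b, n + 1):
            for i in range(b - 1, j):
                candidate = cost[b - 1][i] + sse(i, j)
                if candidate < cost[b][j]:
                    cost[b][j] = candidate
                    split[b][j] = i

    # Walk the split table backwards to recover the bucket boundaries.
    boundaries = []
    j = n
    for b in range(num_buckets, 0, -1):
        i = split[b][j]
        boundaries.append((i, j))
        j = i
    boundaries.reverse()
    return boundaries, cost[num_buckets][n]


# Hypothetical density values along one axis (illustrative only).
values = [1, 1, 1, 10, 10, 10, 5, 5]
boundaries, total_error = v_optimal_histogram(values, 3)
```

The dynamic program runs in O(B·n²) time for n values and B buckets, which hints at why a heuristic alternative to the optimal algorithm is attractive for massive simulation data.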

Keywords: simulation data, data summarization, spatial histograms, exploration, visualization

24112 Evaluation of Surface Water and Groundwater Quality in Parts of Umunneochi Southeast, Nigeria

Authors: Joshua Chima Chizoba, Wisdom Izuchukwu Uzoma, Elizabeth Ifeyiwa Okoyeh

Abstract:

Water cannot be optimally used and sustained unless its quality is periodically assessed. The study area, Umunneochi and environs, is located in the south-eastern part of Nigeria. It stretches geographically from latitudes 5°50′N to 6°00′N and longitudes 7°20′E to 7°30′E. The major geologic formations in the area include the Asu River Group, Nkporo Shale, and Ajali Sandstone. The aim of this study is to evaluate the hydrochemical characteristics of surface and groundwater sources in parts of Umunneochi and environs in order to establish the potability of the water sources for drinking, domestic and irrigation purposes. A total of 15 samples were collected randomly from streams, springs and wells. The samples were analyzed for physicochemical parameters and heavy metals using handheld digital kits, a photometer, the titration method and an Atomic Absorption Spectrophotometer (AAS), following accepted standards. The analytical data were interpreted, and the results were compared with World Health Organization (WHO) standards. The pH ranges from 5.81 to 6.07, while SO₄²⁻ and Cl⁻ concentrations range from 41.93 mg/l to 142.95 mg/l and from 20.00 mg/l to 111 mg/l, respectively. Pb and Zn showed relatively low mean concentrations of 0.14 mg/l and 0.40 mg/l. All of these are within WHO permissible limits except pH. About 27% of the samples are moderately hard. This is attributed to the mining activities in the area. The abundance of cations and anions in the area is in the order K⁺ > Na⁺ > Mg²⁺ > Ca²⁺ and SO₄²⁻ > Cl⁻ > HCO₃⁻ > NO₃⁻, respectively. Chloride, bicarbonate, and nitrate are all within the permissible limits. 13.33% of the total samples contain sulphate above the standard permissible limits. The calculated Water Quality Index (WQI) values are less than 50, indicating excellent water. The predominant water types in the study area are the Na-Cl type and the mixed Ca-Mg-Cl type, based on the sample plots on the Piper diagram. The Sodium Adsorption Ratio (SAR) calculations showed excellent water for consumption and good water for irrigation, with low sodium and alkalinity ratios, respectively. Government water projects are recommended in the area for sustainable domestic and agricultural water supply to ease the stress of water supply problems.
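
The sodium adsorption ratio mentioned in this abstract is conventionally computed as SAR = Na / sqrt((Ca + Mg) / 2), with all concentrations in meq/l. The sketch below is an illustration only, not the authors' procedure: the function names are hypothetical, the sample values are invented, and the hazard classes follow the commonly cited thresholds of 10, 18 and 26, which are assumed here rather than taken from the abstract.

```python
import math

# Equivalent weights in mg per meq (molar mass divided by ionic charge).
EQUIVALENT_WEIGHT = {"Na": 22.99, "Ca": 40.08 / 2, "Mg": 24.305 / 2}


def to_meq_per_l(ion, mg_per_l):
    """Convert a concentration from mg/l to meq/l for the given ion."""
    return mg_per_l / EQUIVALENT_WEIGHT[ion]


def sodium_adsorption_ratio(na_mg_l, ca_mg_l, mg_mg_l):
    """Return SAR = Na / sqrt((Ca + Mg) / 2), computed from mg/l inputs."""
    na = to_meq_per_l("Na", na_mg_l)
    ca = to_meq_per_l("Ca", ca_mg_l)
    mg = to_meq_per_l("Mg", mg_mg_l)
    if ca + mg <= 0:
        raise ValueError("Ca + Mg must be positive to compute SAR")
    return na / math.sqrt((ca + mg) / 2)


def sodium_hazard(sar):
    """Classify irrigation sodium hazard (assumed thresholds: 10, 18, 26)."""
    if sar < 10:
        return "low sodium hazard (excellent)"
    if sar < 18:
        return "medium sodium hazard (good)"
    if sar <= 26:
        return "high sodium hazard (doubtful)"
    return "very high sodium hazard (unsuitable)"


if __name__ == "__main__":
    # Invented sample values in mg/l, for illustration only.
    sar = sodium_adsorption_ratio(na_mg_l=18.0, ca_mg_l=6.0, mg_mg_l=9.0)
    print(round(sar, 2), sodium_hazard(sar))
```

Converting to meq/l before applying the formula matters, because the ratio compares charge equivalents rather than masses.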

Keywords: groundwater, hydrochemical, physicochemical, water-type, sodium adsorption ratio

Procedia PDF Downloads 104
24111 Algorithms used in Spatial Data Mining GIS

Authors: Vahid Bairami Rad

Abstract:

Extracting knowledge from spatial data such as GIS data is important for reducing the data and extracting information. The development of new techniques and tools that support humans in transforming data into useful knowledge has therefore been the focus of the relatively new and interdisciplinary research area ‘knowledge discovery in databases’. We introduce a set of database primitives, or basic operations, for spatial data mining which are sufficient to express most of the spatial data mining algorithms in the literature. This approach has several advantages. As with the relational standard language SQL, the use of standard primitives will speed up the development of new data mining algorithms and will also make them more portable. We introduce a database-oriented framework for spatial data mining which is based on the concepts of neighborhood graphs and paths. A small set of basic operations on these graphs and paths is defined as database primitives for spatial data mining. Furthermore, techniques to efficiently support the database primitives in a commercial DBMS are presented.
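
To make the notions of neighborhood graphs and neighborhood paths concrete, here is a minimal in-memory sketch. It is not the framework described in the abstract: the function names, the symmetric distance-based neighbor relation, and the sample points are all assumptions chosen for illustration, and a real system would evaluate the relation inside the DBMS with spatial index support.

```python
import math
from itertools import combinations


def neighborhood_graph(objects, is_neighbor):
    """Build an undirected graph linking objects that satisfy is_neighbor.

    objects maps an object id to its (x, y) location; is_neighbor is a
    symmetric predicate on two locations (an assumption of this sketch).
    """
    graph = {name: set() for name in objects}
    for a, b in combinations(objects, 2):
        if is_neighbor(objects[a], objects[b]):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def neighborhood_paths(graph, start, max_length):
    """Return all cycle-free paths from start with 1..max_length edges."""
    paths = []

    def extend(path):
        if len(path) > 1:
            paths.append(path)
        if len(path) - 1 == max_length:
            return
        for nxt in sorted(graph[path[-1]]):
            if nxt not in path:
                extend(path + [nxt])

    extend([start])
    return paths


def within(distance):
    """Neighbor relation: two points are neighbors if at most distance apart."""
    return lambda p, q: math.dist(p, q) <= distance


if __name__ == "__main__":
    # Invented sample locations, for illustration only.
    points = {"A": (0, 0), "B": (1, 0), "C": (2, 0), "D": (10, 10)}
    graph = neighborhood_graph(points, within(1.5))
    print(neighborhood_paths(graph, "A", max_length=2))
```

Expressing an algorithm in terms of these two operations, rather than raw geometry, is what would let the same mining code run over different neighbor relations such as distance, direction, or topology.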

Keywords: spatial data base, knowledge discovery database, data mining, spatial relationship, predictive data mining

Procedia PDF Downloads 425
24110 Data Stream Association Rule Mining with Cloud Computing

Authors: B. Suraj Aravind, M. H. M. Krishna Prasad

Abstract:

Emerging applications of data streams, such as network traffic monitoring, web click-stream analysis, sensor data, and satellite data, require association rule mining. Data streams typically arrive continuously at high speed, in huge volumes, and with changing data distributions. This raises new issues that need to be considered when developing association rule mining techniques for stream data. This paper proposes an improved data stream association rule mining algorithm that eliminates the limitation of resources. For this, the concept of cloud computing is used. Its inclusion may lead to additional unknown problems which need further research.
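
As a rough illustration of what association rule mining over a stream involves, the sketch below keeps support counts over a sliding window of the most recent transactions and derives rule confidence from them. It is not the algorithm proposed in the paper, which the abstract does not detail, and it does not involve cloud resources; the class name, the window policy, and the sample transactions are assumptions.

```python
from collections import Counter, deque
from itertools import combinations


class SlidingWindowMiner:
    """Maintain itemset counts over the last window_size transactions."""

    def __init__(self, window_size, max_itemset_size=2):
        self.window_size = window_size
        self.max_itemset_size = max_itemset_size
        self.window = deque()
        self.counts = Counter()

    def _itemsets(self, transaction):
        # Enumerate all itemsets up to the size limit, in canonical order.
        items = sorted(transaction)
        for size in range(1, self.max_itemset_size + 1):
            yield from combinations(items, size)

    def add(self, transaction):
        """Add a new transaction and expire the oldest one if needed."""
        transaction = frozenset(transaction)
        self.window.append(transaction)
        self.counts.update(self._itemsets(transaction))
        if len(self.window) > self.window_size:
            expired = self.window.popleft()
            self.counts.subtract(self._itemsets(expired))

    def support(self, itemset):
        """Fraction of windowed transactions that contain the itemset."""
        if not self.window:
            return 0.0
        key = tuple(sorted(set(itemset)))
        return self.counts[key] / len(self.window)

    def confidence(self, antecedent, consequent):
        """Confidence of the rule antecedent -> consequent."""
        base = self.support(antecedent)
        if base == 0:
            return 0.0
        return self.support(set(antecedent) | set(consequent)) / base


if __name__ == "__main__":
    miner = SlidingWindowMiner(window_size=3)
    # Invented click-stream transactions, for illustration only.
    for clicks in (["home", "cart"], ["home", "search"], ["home", "cart"], ["cart"]):
        miner.add(clicks)
    print(miner.support(["home"]), miner.confidence(["home"], ["cart"]))
```

Bounding the window keeps memory use fixed regardless of stream length, which is one simple way to cope with the resource limits the abstract refers to; itemsets larger than `max_itemset_size` are not tracked and report a support of zero.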

Keywords: data stream, association rule mining, cloud computing, frequent itemsets

Procedia PDF Downloads 470
24109 A Comprehensive Survey and Improvement to Existing Privacy Preserving Data Mining Techniques

Authors: Tosin Ige

Abstract:

Ethics must be a condition of the world, like logic. (Ludwig Wittgenstein, 1889-1951). As important as data mining is, it poses a significant threat to ethics, privacy, and legality, since data mining makes it difficult for an individual or consumer (in the case of a company) to control the accessibility and usage of their data. This research focuses on current issues and the latest research and development on privacy-preserving data mining methods as of 2022. It also discusses some advances in those techniques while highlighting and providing a new technique as an improvement on an existing privacy-preserving data mining technique. This paper also bridges the wide gap between data mining and the Web Application Programming Interface (web API), where research is urgently needed for an added layer of security in data mining, while at the same time introducing a seamless and more efficient way of data mining.

Keywords: data, privacy, data mining, association rule, privacy preserving, mining technique

Procedia PDF Downloads 128
24108 Big Data: Concepts, Technologies and Applications in the Public Sector

Authors: A. Alexandru, C. A. Alexandru, D. Coardos, E. Tudora

Abstract:

Big Data (BD) is associated with a new generation of technologies and architectures which can harness the value of extremely large volumes of very varied data through real time processing and analysis. It involves changes in (1) data types, (2) accumulation speed, and (3) data volume. This paper presents the main concepts related to the BD paradigm, and introduces architectures and technologies for BD and BD sets. The integration of BD with the Hadoop Framework is also underlined. BD has attracted a lot of attention in the public sector due to the newly emerging technologies that allow the availability of network access. The volume of different types of data has exponentially increased. Some applications of BD in the public sector in Romania are briefly presented.

Keywords: big data, big data analytics, Hadoop, cloud

Procedia PDF Downloads 279
24107 Semantic Data Schema Recognition

Authors: Aïcha Ben Salem, Faouzi Boufares, Sebastiao Correia

Abstract:

The subject covered in this paper aims at assisting users in their data quality approach. The goal is to better extract, mix, interpret and reuse data. It deals with the semantic schema recognition of a data source. This enables the extraction of data semantics from all the available information, including the data and the metadata. It consists, firstly, of categorizing the data by assigning it to a category and possibly a sub-category and, secondly, of establishing relations between columns and possibly discovering the semantics of the manipulated data source. The links detected between columns offer a better understanding of the source and of the alternatives for correcting data. This approach allows automatic detection of a large number of syntactic and semantic anomalies.

Keywords: schema recognition, semantic data profiling, meta-categorisation, semantic dependencies inter columns

Procedia PDF Downloads 393
24106 Access Control System for Big Data Application

Authors: Winfred Okoe Addy, Jean Jacques Dominique Beraud

Abstract:

Access control systems (ACs) are some of the most important components in safety areas. Inaccuracies of regulatory frameworks make personal policies and remedies more appropriate than standard models or protocols. This problem is exacerbated by the increasing complexity of software, such as integrated Big Data (BD) software for controlling large volumes of encrypted data and resources embedded in a dedicated BD production system. This paper proposes a general access control strategy system for the diffusion of Big Data domains, since it is crucial to secure the data provided to data consumers (DC). We present a general access control circulation strategy for the Big Data domain by describing the benefit of using designated access control for BD units and performance, and by taking into consideration the needs of BD and AC systems. We then present a generic Big Data access control system to improve the dissemination of Big Data.

Keywords: access control, security, Big Data, domain

Procedia PDF Downloads 106