Search results for: ignorable missing data
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 24431

24251 Transport Emission Inventories and Medical Exposure Modeling: A Missing Link for Urban Health

Authors: Frederik Schulte, Stefan Voß

Abstract:

The adverse effects of air pollution on public health are an increasingly vital problem in planning for urban regions in many parts of the world. The issue is addressed from various angles and by distinct disciplines in research. Epidemiological studies model the relative increase of numerous diseases in response to an increment of different forms of air pollution. A significant share of air pollution in urban regions is related to transport emissions that are often measured and stored in emission inventories. Though, most approaches in transport planning, engineering, and operational design of transport activities are restricted to general emission limits for specific air pollutants and do not consider more nuanced exposure models. We conduct an extensive literature review on exposure models and emission inventories used to study the health impact of transport emissions. Furthermore, we review methods applied in both domains and use emission inventory data of transportation hubs such as ports, airports, and urban traffic for an in-depth analysis of public health impacts deploying medical exposure models. The results reveal specific urban health risks related to transport emissions that may improve urban planning for environmental health by providing insights in actual health effects instead of only referring to general emission limits.

Keywords: emission inventories, exposure models, transport emissions, urban health

Procedia PDF Downloads 363
24250 A Landscape of Research Data Repositories in Re3data.org Registry: A Case Study of Indian Repositories

Authors: Prashant Shrivastava

Abstract:

The purpose of this study is to explore the re3data.org registry and identify the registration workflow for research data repositories. A further objective is to chart the present development of research data repositories in India. The study first examines the re3data.org registry framework and schema design, then explores the status of Indian research data repositories in the registry. Research data repositories are gaining wider relevance due to e-research concepts. The re3data.org registry is now a good tool for users and researchers to identify appropriate research data repositories for their research requirements. In the Indian environment, a compatible National Research Data Policy is needed to boost the management of research data. A registry for research data repositories is a crucial tool for discovering specific information in a specific domain. Research data repositories in India have also not been studied before. Both the re3data.org registry and the status of Indian research data repositories are discussed in this study.

Keywords: research data, research data repositories, research data registry, re3data.org

Procedia PDF Downloads 299
24249 A Study of Cloud Computing Solution for Transportation Big Data Processing

Authors: Ilgin Gökaşar, Saman Ghaffarian

Abstract:

The need for fast processing of big data on transportation ridership (e.g., smartcard data) and traffic operation (e.g., traffic detector data), which requires a lot of computational power, is incontrovertible in Intelligent Transportation Systems. Cloud computing is now an important subject and a popular information technology solution for data processing. It enables users to process enormous amounts of data without having their own computing power. Thus, it can also be a good choice for transportation big data processing. This paper examines how cloud computing can enhance transportation big data processing by contrasting its advantages and disadvantages and discussing cloud computing features.

Keywords: big data, cloud computing, Intelligent Transportation Systems, ITS, traffic data processing

Procedia PDF Downloads 427
24248 Harmonic Data Preparation for Clustering and Classification

Authors: Ali Asheibi

Abstract:

The rapid increase in the size of databases required to store power quality monitoring data has demanded new techniques for analysing and understanding the data. One suggested technique to assist in analysis is data mining. Preparing raw data for data mining exploration takes up most of the effort and time spent in the whole data mining process. Clustering is an important technique in data mining and machine learning in which underlying and meaningful groups of data are discovered. Large amounts of harmonic data have been collected over three years from an actual harmonic monitoring system in a distribution system in Australia. This amount of acquired data makes it difficult to identify operational events that significantly impact the harmonics generated on the system. In this paper, harmonic data preparation processes for a better understanding of the data are presented. Underlying classes in this data have then been identified using a clustering technique based on the Minimum Message Length (MML) method. The underlying operational information contained within the clusters can be rapidly visualised by the engineers. The C5.0 algorithm was used for classification and interpretation of the generated clusters.

Keywords: data mining, harmonic data, clustering, classification

Procedia PDF Downloads 222
24247 Linguistic Summarization of Structured Patent Data

Authors: E. Y. Igde, S. Aydogan, F. E. Boran, D. Akay

Abstract:

Patent data have an increasingly important role in economic growth, innovation, technical advantages and business strategies, and even in competition between countries. Analyzing patent data is crucial since patents cover a large part of all technological information in the world. In this paper, we use the linguistic summarization technique to prove the validity of hypotheses related to patent data stated in the literature.

Keywords: data mining, fuzzy sets, linguistic summarization, patent data

Procedia PDF Downloads 249
24246 Proposal of Data Collection from Probes

Authors: M. Kebisek, L. Spendla, M. Kopcek, T. Skulavik

Abstract:

In our paper we describe the security capabilities of data collection. Data are collected with probes located in the near and distant surroundings of the company. Considering the numerous obstacles (e.g., forests, hills, urban areas), data collection is realized in several ways. The collection of data uses connections via wireless communication, LAN, and the GSM network, and in certain areas data are collected by using vehicles. In order to ensure the connection to the server, most of the probes have the ability to communicate in several ways. Collected data are archived and subsequently used in supervisory applications. To ensure the collection of the required data, it is necessary to propose algorithms that will allow the probes to select a suitable communication channel.

Keywords: communication, computer network, data collection, probe

Procedia PDF Downloads 334
24245 A Review on Big Data Movement with Different Approaches

Authors: Nay Myo Sandar

Abstract:

With the growth of technologies and applications, a large amount of data is being produced at an increasing rate from various sources such as social media networks, sensor devices, and other information-serving devices. Such a massive, complex, and exponentially growing collection of datasets is called big data. Traditional database systems cannot store and process such data because of its size and complexity. Consequently, cloud computing is a potential solution for data storage and processing since it can provide a pool of resources for servers and storage. However, moving large amounts of data to and from the cloud is challenging since it can encounter high latency due to the large data size. With respect to the big data movement problem, this paper reviews the literature of previous works, discusses research issues, and identifies approaches for dealing with the problem.

Keywords: Big Data, Cloud Computing, Big Data Movement, Network Techniques

Procedia PDF Downloads 57
24244 Optimized Approach for Secure Data Sharing in Distributed Database

Authors: Ahmed Mateen, Zhu Qingsheng, Ahmad Bilal

Abstract:

In the current age of technology, information is the most precious asset of a company. Today, companies have a large amount of data. As the data become larger, access to data for some particular information is becoming slower day by day. Faster processing of data to shape it into information is the biggest issue. The major problems in distributed databases are the efficiency of data distribution and the response time of data distribution. The security of data distribution is also a big issue. For these problems, we propose a strategy that can maximize the efficiency of data distribution and also improve its response time. This technique gives better results for secure data distribution from multiple heterogeneous sources. The newly proposed technique helps companies share data securely, efficiently, and quickly.

Keywords: ER-schema, electronic record, P2P framework, API, query formulation

Procedia PDF Downloads 305
24243 Design, Development and Evaluation of a Portable Recording System to Capture Dynamic Presentations using the Teacher's Tablet PC

Authors: Enrique Barra, Abel Carril, Aldo Gordillo, Joaquin Salvachua, Juan Quemada

Abstract:

Computers and multimedia equipment have improved a lot in recent years. They have fallen in cost and size while at the same time increasing in capability. These improvements allowed us to design and implement a portable recording system that also integrates the teacher's tablet PC to capture what he/she writes on the slides and everything that happens on it. This paper explains the system in detail, together with the validation of the recordings that we carried out after using it to record all the lectures of a course at our university called “Communications Software”. The results show that pupils used the recordings for different purposes and consider them useful for a variety of things, especially after missing a lecture.

Keywords: recording system, capture dynamic presentations, lecture recording

Procedia PDF Downloads 339
24242 Calculating All Dark Energy and Dark Matter Effects through Dynamic Gravity Theory

Authors: Sean Michael Kinney

Abstract:

In 1666, Newton created the Law of Universal Gravitation. In 1915, Einstein improved it to incorporate factors such as time dilation and gravitational lensing. But currently, there is a problem with this “universal” law: the math doesn’t work outside the confines of our solar system. Something is also missing: any evidence of what gravity actually is and how it manifests. This paper explores the notion that gravity must obey the law of conservation of energy, as all other forces in this universe have been shown to do. It explains exactly what gravity is and how it manifests itself, and examines the many implications that follow. Finally, it uses the math of Dynamic gravity to calculate Dark Energy and Dark Matter effects to explain all observations without the need for exotic measures.

Keywords: dynamic gravity, gravity, dark matter, dark energy

Procedia PDF Downloads 48
24241 Data Mining Algorithms Analysis: Case Study of Price Predictions of Lands

Authors: Julio Albuja, David Zaldumbide

Abstract:

Data analysis is an important step before making a decision about money. The aim of this work is to analyze the factors that influence the final price of houses through data mining algorithms. To the best of our knowledge, previous work was examined only to compare results. Before using the data set, the Z-Transformation was used to standardize the data to the same range. The data was then classified into two groups to visualize it in a readable format. A decision tree was built, and the data is displayed graphically so that the results and the factors' influence are easy to see. The definitions of these methods are described, as well as the descriptions of the results. Finally, conclusions and recommendations are presented based on the results of our research, making it easier to apply these algorithms to a customized data set.
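
The Z-Transformation step mentioned above can be illustrated with a minimal sketch. It assumes the common definition z = (x - mean) / standard deviation using the population standard deviation; the abstract does not say which variant the authors used, and the function name `z_transform` and the sample values are hypothetical.

```python
from statistics import mean, pstdev


def z_transform(values):
    """Standardize values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values are identical: there is no spread to scale by.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical land areas in square metres (illustrative, not from the study).
areas = [100.0, 200.0, 300.0]
z_areas = z_transform(areas)
```

After this step, features measured in different units (for example area and number of rooms) share a comparable scale, which is the purpose the abstract gives for the transformation.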

Keywords: algorithms, data, decision tree, transformation

Procedia PDF Downloads 348
24240 Evaluation of the Impact of Neuropathic Pain on the Quality of Life of Patients

Authors: A. Ibovi Mouondayi, S. Zaher, R. Assadi, K. Erraoui, S. Sboul, J. Daoudim, S. Bousselham, K. Nassar, S. Janani

Abstract:

Introduction: Neuropathic pain (NP) is chronic pain; it can be observed in a large number of clinical situations. This pain results from a lesion of the peripheral or central nervous system. It is a frequent reason for consultations in rheumatology. Being chronic, this pain can become disabling for the patient, thereby altering his quality of life. Objective: The objective of this study was to evaluate the impact of neuropathic pain on the quality of life of patients followed up for chronic neuropathic pain. Material and Method: This is a monocentric, cross-sectional, descriptive, retrospective study conducted in our department over a period of 19 months from October 2020 to April 2022. The missing parameters were collected during phone calls with the patients concerned. The diagnostic tool adopted was the DN4 questionnaire in the dialectal Arabic version. The impact of NP was assessed by the visual analog scale (VAS) on pain, sleep, and function. The impact of NP on mood was assessed by the Hospital Anxiety and Depression scale (HAD) score in the validated Arabic version. The exclusion criteria were patients followed up for depression and other psychiatric pathologies. Results: Data were collected for a total of 1528 patients; the average age of the patients was 57 years (standard deviation: 13 years) with extremes ranging from 17 years to 94 years; 91% were women and 9% men, with a man/woman sex ratio of 0.10. 67% of our patients were married, and 63% of our patients were housewives. 43% of patients were followed up for degenerative pathology. The NP was cervical radiculopathy in 26%, lumbosacral radiculopathy in 51%, and carpal tunnel syndrome in 20%. 23% of our patients had poor sleep quality, and 54% had average sleep quality. The pain was very intense in 5% of patients; 33% had severe pain, and 58% had moderate pain. Function was limited in 55% of patients. The average HAD score for anxiety and depression was 4.39 (standard deviation: 2.77) and 3.21 (standard deviation: 2.89), respectively. Conclusion: Our data clearly illustrate that neuropathic pain has a negative impact on the quality of sleep and function, as well as the mood of patients, thus influencing their quality of life.

Keywords: neuropathic pain, sleep, quality of life, chronic pain

Procedia PDF Downloads 106
24239 Application of Blockchain Technology in Geological Field

Authors: Mengdi Zhang, Zhenji Gao, Ning Kang, Rongmei Liu

Abstract:

Management and application of geological big data is an important part of China's national big data strategy. With the implementation of a national big data strategy, geological big data management becomes more and more critical. At present, there are still many technological barriers as well as conceptual confusion in many aspects of geological big data management and application, such as data sharing, intellectual property protection, and application technology. Therefore, it’s a key task to make better use of new technologies for deeper delving and wider application of geological big data. In this paper, we briefly introduce the basic principle of blockchain technology at the beginning and then make an analysis of the application dilemma of geological data. Based on the current analysis, we bring forward some feasible patterns and scenarios for the blockchain application in geological big data and put forward several suggestions for future work in geological big data management.

Keywords: blockchain, intellectual property protection, geological data, big data management

Procedia PDF Downloads 57
24238 Everyday-Life Vocabulary: A Missing Component in Iranian EFL Context

Authors: Yasser Aminifard, Hamdollah Askari

Abstract:

This study aimed at investigating any difference between Iranian senior high school students' performance on Academic Words (AWs) and Everyday-Life Words (ELWs). To this end, in the first phase, 120 male senior high school students were randomly selected from among twelve high schools in Gachsaran to serve as the participants of the study. In the second phase, using purposive sampling, six high school teachers holding an MA in TEFL and with over twenty years of teaching experience were interviewed. Two multiple-choice tests, each comprising 40 items, were given to the participants in order to determine their performance on AWs and ELWs, and follow-up semi-structured interviews were conducted to explore teachers' opinions about participants' performance on the two tests. To analyze the data, a paired-samples t-test was carried out to compare the results of both tests, and the interviews were transcribed to pinpoint important themes. The results of the t-test indicated that the participants performed significantly better on AWs than on ELWs. Additionally, results of the interviews boiled down to the fact that the English textbooks designed for Iranian high school students are fundamentally flawed on the grounds that there is a mismatch between students' real language learning needs and what is presented to them as "teaching-to-the-test" materials via these books. Finally, the implications and suggestions for further research are discussed.
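
The paired-samples t-test named above compares two scores from the same participant by testing whether the mean of the per-participant differences is zero. The sketch below shows only the test statistic, using the standard formula t = mean(d) / (sd(d) / sqrt(n)) with n - 1 degrees of freedom. The function name and the four pairs of scores are hypothetical and are not data from the study.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(first, second):
    """Return (t, degrees_of_freedom) for a paired-samples t-test."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equally long samples with at least 2 pairs")
    diffs = [a - b for a, b in zip(first, second)]
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    n = len(diffs)
    return mean(diffs) / (sd / sqrt(n)), n - 1


# Hypothetical scores out of 40 for four students (illustrative only).
aw_scores = [30, 28, 35, 32]   # Academic Words test
elw_scores = [25, 27, 30, 28]  # Everyday-Life Words test
t_value, dof = paired_t_statistic(aw_scores, elw_scores)
```

A positive `t_value` here means the first test's scores are higher on average; turning it into a p-value requires the t distribution with `dof` degrees of freedom, which is left out of this sketch.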

Keywords: everyday-life words, academic words, textbooks, washback

Procedia PDF Downloads 434
24237 Frequent Item Set Mining for Big Data Using MapReduce Framework

Authors: Tamanna Jethava, Rahul Joshi

Abstract:

Frequent item sets play an essential role in many data mining tasks that try to find interesting patterns in a database. Typically, a frequent item set is a set of items that frequently appear together in a transaction dataset. Several mining algorithms are used for frequent item set mining, yet most do not scale to the type of data we are presented with today, so-called “Big Data”. Big Data is a collection of large data sets. Our approach is to perform frequent item set mining over large datasets in a scalable and speedy way. MapReduce, along with HDFS, is used to find frequent item sets in Big Data on a large cluster. This paper focuses on using a pre-processing and mining algorithm as a hybrid approach for big data on the Hadoop platform.
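
To make the map and reduce roles concrete, here is a minimal single-process sketch of support counting for item sets of a fixed size. It is an assumption-laden illustration: it does not use Hadoop or HDFS, it enumerates subsets by brute force, and it is not the hybrid pre-processing and mining algorithm the paper refers to. All function names and the sample transactions are hypothetical.

```python
from collections import Counter
from itertools import combinations


def map_phase(transaction, k):
    """Emit (itemset, 1) for every k-item subset of one transaction."""
    items = sorted(set(transaction))
    return [(itemset, 1) for itemset in combinations(items, k)]


def reduce_phase(pairs):
    """Sum the counts emitted for each item set."""
    counts = Counter()
    for itemset, n in pairs:
        counts[itemset] += n
    return counts


def frequent_itemsets(transactions, k, min_support):
    """Return k-item sets that occur in at least min_support transactions."""
    emitted = []
    for transaction in transactions:
        emitted.extend(map_phase(transaction, k))
    counts = reduce_phase(emitted)
    return {s: c for s, c in counts.items() if c >= min_support}


# Hypothetical market-basket transactions (illustrative only).
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
]
pairs = frequent_itemsets(transactions, k=2, min_support=2)
```

In a real cluster the map calls would run in parallel over partitions of the transaction file and the framework would group emitted keys before the reduce step; the loop above only imitates that data flow.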

Keywords: frequent item set mining, big data, Hadoop, MapReduce

Procedia PDF Downloads 393
24236 Trend of Foot and Mouth Disease and Adopted Control Measures in Limpopo Province during the Period 2014 to 2020

Authors: Temosho Promise Chuene, T. Chitura

Abstract:

Background: Foot and mouth disease is a real challenge in South Africa. The disease is a serious threat to the viability of livestock farming initiatives and affects local and international livestock trade. In Limpopo Province, the Kruger National Park and other game reserves are home to the African buffalo (Syncerus caffer), a notorious reservoir of the picornavirus, which causes foot and mouth disease. Out of the virus’s seven (7) distinct serotypes, Southern African Territories (SAT) 1, 2, and 3 are commonly endemic in South Africa. The broad objective of the study was to establish the trend of foot and mouth disease in Limpopo Province over a seven-year period (2014-2020), as well as the adoption and comprehensive reporting of the measures that are taken to contain disease outbreaks in the study area. Methods: The study used secondary data from the World Organization for Animal Health (WOAH) on reported cases of foot and mouth disease in South Africa. Descriptive analysis (frequencies and percentages) and Analysis of variance (ANOVA) were used to present and analyse the data. Result: The year 2020 had the highest prevalence of foot and mouth disease (3.72%), while 2016 had the lowest prevalence (0.05%). Serotype SAT 2 was the most endemic, followed by SAT 1. Findings from the study demonstrated the seasonal nature of foot and mouth disease in the study area, as most disease cases were reported in the summer seasons. Slaughter of diseased and at-risk animals was the only documented disease control strategy, and information was missing for some of the years. Conclusion: The study identified serious underreporting of the adopted control strategies following disease outbreaks. Adoption of comprehensive disease control strategies coupled with thorough reporting can help to reduce outbreaks of foot and mouth disease and prevent losses to the livestock farming sector of South Africa and Limpopo Province in particular.

Keywords: livestock farming, African buffalo, prevalence, serotype, slaughter

Procedia PDF Downloads 37
24235 The Role Of Data Gathering In NGOs

Authors: Hussaini Garba Mohammed

Abstract:

Background/Significance: The lack of data gathering prevents NGOs worldwide from having good data on educational and health-related issues among communities in any country and around the world. For example, HIV/AIDS, smoking (tuberculosis diseases), and COVID-19 virus carriers are becoming a serious public health problem, especially among old men and women. Yet in some countries there is no fully detailed survey data from communities, villages, and rural areas to show the percentage of victims and patients, especially for COVID-19. These data are essential to inform programming targets, strategies, and priorities, and to obtain good information from data gathering in any society.

Keywords: reliable information, data assessment, data mining, data communication

Procedia PDF Downloads 158
24234 The Application of Data Mining Technology in Building Energy Consumption Data Analysis

Authors: Liang Zhao, Jili Zhang, Chongquan Zhong

Abstract:

Energy consumption data, in particular those involving public buildings, are impacted by many factors: the building structure, climate/environmental parameters, construction, system operating condition, and user behavior patterns. Traditional methods for data analysis are insufficient. This paper delves into data mining technology to determine its application in the analysis of building energy consumption data, including energy consumption prediction, fault diagnosis, and optimal operation. Recent literature is reviewed and summarized, the problems faced by data mining technology in the area of energy consumption data analysis are enumerated, and research points for future studies are given.

Keywords: data mining, data analysis, prediction, optimization, building operational performance

Procedia PDF Downloads 822
24233 To Handle Data-Driven Software Development Projects Effectively

Authors: Shahnewaz Khan

Abstract:

Machine learning (ML) techniques are often used in projects for creating data-driven applications. These tasks typically demand additional research and analysis. The proper technique and strategy must be chosen to ensure the success of data-driven projects. Otherwise, even with a lot of effort, the necessary development might not always be possible. In this post, we examine the workflow of data-driven software development projects and its implementation process in order to describe how to manage a project successfully, which will assist in minimizing the added workload.

Keywords: data, data-driven projects, data science, NLP, software project

Procedia PDF Downloads 57
24232 On the Optimization of a Decentralized Photovoltaic System

Authors: Zaouche Khelil, Talha Abdelaziz, Berkouk El Madjid

Abstract:

In this paper, we present a grid-tied photovoltaic system. The studied topology is structured around a seven-level inverter, supplying a non-linear load. A three-stage step-up DC/DC converter ensures DC-link balancing. The presented system allows the extraction of all the available photovoltaic power. This extracted energy feeds the local load; the surplus energy is injected into the electrical network. During poor weather conditions, where the photovoltaic panels cannot meet the energy needs of the load, the missing power is supplied by the electrical network. At the common connexion point, the network current shows excellent spectral performances.

Keywords: seven-level inverter, multi-level DC/DC converter, photovoltaic, non-linear load

Procedia PDF Downloads 156
24231 Improving Data Completeness and Timely Reporting: A Joint Collaborative Effort between Partners in Health and Ministry of Health in Remote Areas, Neno District, Malawi

Authors: Wiseman Emmanuel Nkhomah, Chiyembekezo Kachimanga, Moses Banda Aron, Julia Higgins, Manuel Mulwafu, Kondwani Mpinga, Mwayi Chunga, Grace Momba, Enock Ndarama, Dickson Sumphi, Atupere Phiri, Fabien Munyaneza

Abstract:

Background: Data is key to supporting health service delivery, as stakeholders, including NGOs, rely on it for effective service delivery, decision-making, and system strengthening. Several studies generated debate on data quality from national health management information systems (HMIS) in sub-Saharan Africa. This limits the utilization of data in resource-limited settings, which already struggle to meet standards set by the World Health Organization (WHO). We aimed to evaluate data quality improvement of Neno district HMIS over a 4-year period (2018 – 2021) following quarterly data reviews introduced in January 2020 by the district health management team and Partners In Health. Methods: Exploratory Mixed Research was used to examine report rates, followed by in-depth interviews using Key Informant Interviews (KIIs) and Focus Group Discussions (FGDs). We used the WHO module desk review to assess the quality of HMIS data in the Neno district captured from 2018 to 2021. The metrics assessed included the completeness and timeliness of 34 reports. Completeness was measured as a percentage of non-missing reports. Timeliness was measured as the span between data inputs and expected outputs meeting needs. We computed t-tests and recorded p-values, summaries, and percentage changes using R and Excel 2016. We analyzed demographics for key informant interviews in Power BI. We developed themes from 7 FGDs and 11 KIIs using Dedoose software, from which we picked perceptions of healthcare workers, interventions implemented, and improvement suggestions. The study was reviewed and approved by Malawi National Health Science Research Committee (IRB: 22/02/2866). Results: Overall, the average reporting completeness rate was 83.4% (before) and 98.1% (after), while timeliness was 68.1% and 76.4%, respectively. Completeness of reports increased over time: 2018, 78.8%; 2019, 88%; 2020, 96.3% and 2021, 99.9% (p < 0.004). The trend for timeliness has been declining except in 2021, where it improved: 2018, 68.4%; 2019, 68.3%; 2020, 67.1% and 2021, 81% (p < 0.279). Comparing 2021 reporting rates to the mean of the three preceding years, completeness increased from 88% to 99% (in 2021), while timeliness increased from 68% to 81%. Sixty-five percent of reports consistently met the national standard of 90%+ in completeness, but only 24% did so in timeliness. Thirty-two percent of reports met the national standard. Only 9% improved on both completeness and timeliness, namely the cervical cancer, nutrition care support and treatment, and youth-friendly health services reports. 50% of reports did not improve to standard in timeliness, and only one did not in completeness. On the other hand, factors associated with improvement included improved communications and reminders using internal communication, data quality assessments, checks, and reviews. Decentralizing data entry at the facility level was suggested to improve timeliness. Conclusion: Findings suggest that data quality in HMIS for the district has improved following collaborative efforts. We recommend maintaining such initiatives to identify remaining quality gaps and that results be shared publicly to support increased use of data. These results can inform Ministry of Health and its partners on some interventions and advise initiatives for improving its quality.
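
The completeness metric and the three-year baseline quoted in this abstract can be reproduced with a short sketch. The yearly percentages below are copied from the abstract; the function names and the 34-expected/17-received example are hypothetical, and the sketch assumes the baseline is a simple unweighted mean of the 2018 to 2020 rates.

```python
def completeness(expected, received):
    """Percentage of expected reports that were actually submitted."""
    if expected <= 0:
        raise ValueError("expected must be positive")
    return 100.0 * received / expected


# Yearly rates quoted in the abstract (percent).
completeness_by_year = {2018: 78.8, 2019: 88.0, 2020: 96.3, 2021: 99.9}
timeliness_by_year = {2018: 68.4, 2019: 68.3, 2020: 67.1, 2021: 81.0}


def baseline_mean(rates, years=(2018, 2019, 2020)):
    """Unweighted mean of the rates for the given baseline years."""
    return sum(rates[y] for y in years) / len(years)


baseline_completeness = baseline_mean(completeness_by_year)
baseline_timeliness = baseline_mean(timeliness_by_year)

# Hypothetical facility that submitted 17 of 34 expected reports.
example_rate = completeness(expected=34, received=17)
```

Rounded to whole percentages, the two baselines come out at 88% and 68%, which matches the "mean of the three preceding years" figures in the abstract.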

Keywords: data quality, data utilization, HMIS, collaboration, completeness, timeliness, decision-making

Procedia PDF Downloads 54
24230 The Relationship Between Artificial Intelligence, Data Science, and Privacy

Authors: M. Naidoo

Abstract:

Artificial intelligence often requires large amounts of good quality data. Within important fields, such as healthcare, the training of AI systems predominately relies on health and personal data; however, the usage of this data is complicated by various layers of law and ethics that seek to protect individuals’ privacy rights. This research seeks to establish the challenges AI and data sciences pose to (i) informational rights, (ii) privacy rights, and (iii) data protection. To solve some of the issues presented, various methods are suggested, such as embedding values in technological development, proper balancing of rights and interests, and others.

Keywords: artificial intelligence, data science, law, policy

Procedia PDF Downloads 86
24229 Simulation Data Summarization Based on Spatial Histograms

Authors: Jing Zhao, Yoshiharu Ishikawa, Chuan Xiao, Kento Sugiura

Abstract:

In order to analyze large-scale scientific data, research on data exploration and visualization has gained popularity. In this paper, we focus on the exploration and visualization of scientific simulation data, and define a spatial V-Optimal histogram for data summarization. We propose histogram construction algorithms based on a general binary hierarchical partitioning as well as a more specific one, the l-grid partitioning. For effective data summarization and efficient data visualization in scientific data analysis, we propose an optimal algorithm as well as a heuristic algorithm for histogram construction. To verify the effectiveness and efficiency of the proposed methods, we conduct experiments on the massive evacuation simulation data.
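
For readers unfamiliar with V-Optimal histograms, the sketch below shows the classic one-dimensional version: split an ordered sequence into a fixed number of contiguous buckets so that the total sum of squared errors (SSE) around each bucket mean is minimal. This is a deliberate simplification. The paper defines a spatial histogram over hierarchical and l-grid partitionings, and its algorithms are not reproduced here; the function name and sample values are hypothetical.

```python
def v_optimal_histogram(values, num_buckets):
    """Split `values` (in the given order) into contiguous buckets with minimal SSE.

    Returns (total_sse, boundaries); boundaries are (start, end) index
    pairs with `end` exclusive.
    """
    n = len(values)
    if not 1 <= num_buckets <= n:
        raise ValueError("num_buckets must be between 1 and len(values)")

    # Prefix sums give the SSE of any slice in O(1).
    prefix = [0.0] * (n + 1)
    prefix_sq = [0.0] * (n + 1)
    for i, v in enumerate(values):
        prefix[i + 1] = prefix[i] + v
        prefix_sq[i + 1] = prefix_sq[i] + v * v

    def sse(i, j):
        """Sum of squared deviations from the mean over values[i:j]."""
        s = prefix[j] - prefix[i]
        sq = prefix_sq[j] - prefix_sq[i]
        return sq - s * s / (j - i)

    inf = float("inf")
    # best[b][j]: minimal SSE covering values[:j] with exactly b buckets.
    best = [[inf] * (n + 1) for _ in range(num_buckets + 1)]
    cut = [[0] * (n + 1) for _ in range(num_buckets + 1)]
    best[0][0] = 0.0
    for b in range(1, num_buckets + 1):
        for j in range(b, n + 1):
            for i in range(b - 1, j):
                cost = best[b - 1][i] + sse(i, j)
                if cost < best[b][j]:
                    best[b][j] = cost
                    cut[b][j] = i

    # Walk the cut table backwards to recover the bucket boundaries.
    boundaries = []
    j = n
    for b in range(num_buckets, 0, -1):
        i = cut[b][j]
        boundaries.append((i, j))
        j = i
    boundaries.reverse()
    return best[num_buckets][n], boundaries


# Hypothetical cell densities along one axis (illustrative only).
densities = [1.0, 1.0, 1.0, 9.0, 9.0, 9.0]
total_sse, buckets = v_optimal_histogram(densities, 2)
```

The dynamic program runs in O(B · n²) time for B buckets, which is one reason a heuristic alternative, as the abstract proposes for the spatial case, becomes attractive on massive data.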

Keywords: simulation data, data summarization, spatial histograms, exploration, visualization

Procedia PDF Downloads 156
24228 Algorithms used in Spatial Data Mining GIS

Authors: Vahid Bairami Rad

Abstract:

Extracting knowledge from spatial data like GIS data is important to reduce the data and extract information. Therefore, the development of new techniques and tools that support the human in transforming data into useful knowledge has been the focus of the relatively new and interdisciplinary research area ‘knowledge discovery in databases’. Thus, we introduce a set of database primitives or basic operations for spatial data mining which are sufficient to express most of the spatial data mining algorithms from the literature. This approach has several advantages. Similar to the relational standard language SQL, the use of standard primitives will speed-up the development of new data mining algorithms and will also make them more portable. We introduced a database-oriented framework for spatial data mining which is based on the concepts of neighborhood graphs and paths. A small set of basic operations on these graphs and paths were defined as database primitives for spatial data mining. Furthermore, techniques to efficiently support the database primitives by a commercial DBMS were presented.

Keywords: spatial data base, knowledge discovery database, data mining, spatial relationship, predictive data mining

Procedia PDF Downloads 428
24227 Data Stream Association Rule Mining with Cloud Computing

Authors: B. Suraj Aravind, M. H. M. Krishna Prasad

Abstract:

Emerging applications of data streams, such as network traffic monitoring, web click stream analysis, sensor data, and satellite data, require association rule mining. Data streams typically arrive continuously, at high speed and in huge volumes, with changing data distributions. This raises new issues that need to be considered when developing association rule mining techniques for stream data. This paper proposes an improved data stream association rule mining algorithm that eliminates the limitation of resources by using cloud computing. Including cloud computing may lead to additional unknown problems, which need further research.
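The general problem of mining association rules from a stream under bounded memory can be sketched with a sliding window that keeps counts only for the most recent transactions. This is a generic single-machine illustration, not the cloud-based algorithm proposed in the paper; the class name `SlidingWindowPairMiner` and the restriction to item pairs are assumptions.

```python
from collections import Counter, deque
from itertools import combinations


class SlidingWindowPairMiner:
    """Counts items and item pairs over the last `window_size` transactions."""

    def __init__(self, window_size):
        self.window_size = window_size
        self.window = deque()
        self.item_counts = Counter()
        self.pair_counts = Counter()

    def _update(self, transaction, delta):
        # Add (delta=+1) or remove (delta=-1) one transaction's contribution.
        items = sorted(set(transaction))
        for item in items:
            self.item_counts[item] += delta
        for pair in combinations(items, 2):
            self.pair_counts[pair] += delta

    def add(self, transaction):
        """Consume one transaction and evict the oldest if the window is full."""
        self.window.append(transaction)
        self._update(transaction, +1)
        if len(self.window) > self.window_size:
            self._update(self.window.popleft(), -1)

    def rules(self, min_support, min_confidence):
        """Return sorted (lhs, rhs, support, confidence) tuples for the window."""
        n = len(self.window)
        found = []
        for (a, b), count in self.pair_counts.items():
            if count == 0 or count / n < min_support:
                continue
            for lhs, rhs in ((a, b), (b, a)):
                confidence = count / self.item_counts[lhs]
                if confidence >= min_confidence:
                    found.append((lhs, rhs, count / n, confidence))
        return sorted(found)
```

Because old transactions are evicted, the reported rules follow the changing distribution of the stream, at the cost of forgetting patterns older than the window.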

Keywords: data stream, association rule mining, cloud computing, frequent itemsets

Procedia PDF Downloads 473
24226 “Chasing Hope”: Parents’ Perspectives on Complementary and Alternative Interventions for Autism Spectrum Disorder Children in Kazakhstan

Authors: Sofiya An, Akbota Kanderzhanova, Assel Akhmetova, Faye Foster, Chee K. Chan

Abstract:

Healthcare, education and social support for children with autism in Kazakhstan have been evolving and transforming over the last three decades. There is still limited knowledge of the use of complementary and alternative medicine by families caring for autistic children in this post-Soviet region. An exploratory qualitative focus group study of Kazakhstani families was carried out to capture and understand their experiences of using complementary and alternative medicine (CAM). A total of six focus groups were conducted in five cities across the country: Nur-Sultan, Almaty, Kyzylorda, Karaganda and Taraz. The perceived factors driving the availability, choice, and use of complementary and alternative medicine by families of autistic children in the country were distilled and evaluated. The data collected were analyzed using framework analysis, and themes and subthemes were developed. Two major themes stood out. The first was “unmet needs”, which relates to the predisposing factors that motivate parents to take up CAM, and the second was “chasing hope”, which relates to the enabling factors that facilitate parents’ uptake of CAM. Fear of missing out (FOMO) is a latent motivation underlying both themes. Parents of children with autism spectrum disorder (ASD) in Kazakhstan have to deal with many challenges when seeking treatment for their children. They are prepared to try whatever CAM interventions are available. Their choices are driven by the lack of options and the hope of any potential positive outcome rather than by rational decisions based on efficacy or evidence-based data on CAM. Parents become desperate and are willing to try CAM regardless of their cultural and belief systems, because they do not want to miss out in case it might work.
This study also gives an international and cross-cultural perspective on the motives, choices and practices of parents of children with ASD who use CAM in Kazakhstan, a Central Asian country.

Keywords: autism spectrum disorder, Central Asia, complementary and alternative medicine, cross-cultural perspective, qualitative research

Procedia PDF Downloads 119
24225 A Comprehensive Survey and Improvement to Existing Privacy Preserving Data Mining Techniques

Authors: Tosin Ige

Abstract:

Ethics must be a condition of the world, like logic. (Ludwig Wittgenstein, 1889-1951). As important as data mining is, it poses a significant threat to ethics, privacy, and legality, since data mining makes it difficult for an individual or consumer (in the case of a company) to control the accessibility and usage of their data. This research focuses on current issues and the latest research and development on privacy preserving data mining methods as of 2022. It also discusses some advances in those techniques, highlighting one existing privacy preserving data mining technique and proposing a new technique that improves on it. This paper also bridges the wide gap between data mining and the Web Application Programming Interface (web API), where research is urgently needed for an added layer of security in data mining, while at the same time introducing a seamless and more efficient way of data mining.

Keywords: data, privacy, data mining, association rule, privacy preserving, mining technique

Procedia PDF Downloads 131
24224 Big Data: Concepts, Technologies and Applications in the Public Sector

Authors: A. Alexandru, C. A. Alexandru, D. Coardos, E. Tudora

Abstract:

Big Data (BD) is associated with a new generation of technologies and architectures which can harness the value of extremely large volumes of very varied data through real-time processing and analysis. It involves changes in (1) data types, (2) accumulation speed, and (3) data volume. This paper presents the main concepts related to the BD paradigm, and introduces architectures and technologies for BD and BD sets. The integration of BD with the Hadoop Framework is also underlined. BD has attracted a lot of attention in the public sector due to newly emerging technologies that make network access widely available. The volume of different types of data has increased exponentially. Some applications of BD in the public sector in Romania are briefly presented.

Keywords: big data, big data analytics, Hadoop, cloud

Procedia PDF Downloads 282
24223 Semantic Data Schema Recognition

Authors: Aïcha Ben Salem, Faouzi Boufares, Sebastiao Correia

Abstract:

The subject covered in this paper aims at assisting users in their data quality approach. The goal is to better extract, mix, interpret and reuse data. It deals with the semantic schema recognition of a data source. This enables the extraction of data semantics from all the available information, including the data and the metadata. It consists, firstly, of categorizing the data by assigning it to a category and possibly a sub-category, and secondly, of establishing relations between columns and possibly discovering the semantics of the manipulated data source. The links detected between columns offer a better understanding of the source and of the alternatives for correcting data. This approach allows automatic detection of a large number of syntactic and semantic anomalies.
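The categorization step described here can be illustrated with a small sketch that assigns each value in a column to a category and flags values that disagree with the dominant category. The three regular-expression categories and the names `CATEGORY_PATTERNS`, `categorize_value`, and `profile_column` are assumptions for illustration; the paper's own categories and detection method are not specified in this abstract.

```python
import re
from collections import Counter

# Hypothetical syntactic categories; a real profiler would use many more.
CATEGORY_PATTERNS = {
    "integer": re.compile(r"-?\d+"),
    "date": re.compile(r"\d{4}-\d{2}-\d{2}"),
    "email": re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+"),
}


def categorize_value(value):
    """Return the first category whose pattern matches the whole value."""
    for name, pattern in CATEGORY_PATTERNS.items():
        if pattern.fullmatch(value.strip()):
            return name
    return "unknown"


def profile_column(values):
    """Return (dominant_category, anomalies) for a non-empty column."""
    labels = [categorize_value(v) for v in values]
    dominant, _ = Counter(labels).most_common(1)[0]
    # Values outside the dominant category are candidate anomalies.
    anomalies = [v for v, label in zip(values, labels) if label != dominant]
    return dominant, anomalies
```

A column such as `["2020-01-05", "2021-12-31", "31/12/2021"]` would be labeled as a date column, with the last value reported as a candidate syntactic anomaly.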

Keywords: schema recognition, semantic data profiling, meta-categorisation, semantic dependencies inter columns

Procedia PDF Downloads 395
24222 Modeling Stream Flow with Prediction Uncertainty by Using SWAT Hydrologic and RBNN Neural Network Models for Agricultural Watershed in India

Authors: Ajai Singh

Abstract:

Simulation of hydrological processes at the watershed outlet through a modelling approach is essential for proper planning and implementation of appropriate soil conservation measures in the Damodar Barakar catchment, Hazaribagh, India, where soil erosion is a dominant problem. This study quantifies the parametric uncertainty involved in the simulation of stream flow using the Soil and Water Assessment Tool (SWAT), a watershed-scale model, and a Radial Basis Neural Network (RBNN), an artificial neural network model. Both models were calibrated and validated against measured stream flow, and the uncertainty in the SWAT model output was quantified using the “Sequential Uncertainty Fitting Algorithm” (SUFI-2). Although both models predicted satisfactorily, the RBNN model performed better than SWAT, with R2 and NSE values of 0.92 and 0.92 during training, and 0.71 and 0.70 during the validation period, respectively. Comparison of the results of the two models also indicates a wider prediction interval for the SWAT model. The P-factor values show that the percentage of observed stream flow values bracketed by the 95PPU is higher for the RBNN model (91%) than for SWAT (87%). In other words, the RBNN model estimates the stream flow values more accurately and with less uncertainty. It could be stated that an RBNN model based on simple input could be used for estimating monthly stream flow, filling in missing data, and testing the accuracy and performance of other models.
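The evaluation measures named in this abstract can be computed as in the sketch below, which uses the standard definitions: the Nash-Sutcliffe efficiency (NSE), R2 taken as the squared Pearson correlation, and the P-factor as the fraction of observations that fall inside the 95PPU band. The function names and the plain-list inputs are assumptions; the study's actual data and tooling are not shown in the abstract.

```python
def nash_sutcliffe(observed, simulated):
    """NSE = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2)."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance


def r_squared(observed, simulated):
    """Squared Pearson correlation; undefined if either series is constant."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_s = sum(simulated) / n
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(observed, simulated))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_s = sum((s - mean_s) ** 2 for s in simulated)
    return cov * cov / (var_o * var_s)


def p_factor(observed, lower, upper):
    """Fraction of observations bracketed by the prediction band."""
    inside = sum(1 for o, lo, hi in zip(observed, lower, upper) if lo <= o <= hi)
    return inside / len(observed)
```

An NSE of 1 means a perfect match, while an NSE of 0 means the model is no better than predicting the mean of the observations. A P-factor closer to 1 means the uncertainty band captures more of the measured values.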

Keywords: SWAT, RBNN, SUFI 2, bootstrap technique, stream flow, simulation

Procedia PDF Downloads 329