Search results for: georeferenced data
25090 Rural Entrepreneurship as a Response to Climate Change and Resource Conservation
Authors: Omar Romero-Hernandez, Federico Castillo, Armando Sanchez, Sergio Romero, Andrea Romero, Michael Mitchell
Abstract:
Environmental policies for resource conservation in rural areas include subsidies on services and social programs to cover living expenses. Government's expectation is that rural communities who benefit from social programs, such as payment for ecosystem services, are provided with an incentive to conserve natural resources and preserve natural sinks for greenhouse gases. At the same time, global climate change has affected the lives of people worldwide. The capability to adapt to global warming depends on the available resources and the standard of living, putting rural communities at a disadvantage. This paper explores whether rural entrepreneurship can represent a solution to resource conservation and global warming adaptation in rural communities. The research focuses on a sample of two coffee communities in Oaxaca, Mexico. Researchers used geospatial information contained in aerial photographs of the geographical areas of interest. Households were identified in the photos via the roofs of households and georeferenced via coordinates. From the household population, a random selection of roofs was performed and received a visit. A total of 112 surveys were completed, including questions of socio-demographics, perception to climate change and adaptation activities. The population includes two groups of study: entrepreneurs and non-entrepreneurs. Data was sorted, filtered, and validated. Analysis includes descriptive statistics for exploratory purposes and a multi-regression analysis. Outcomes from the surveys indicate that coffee farmers, who demonstrate entrepreneurship skills and hire employees, are more eager to adapt to climate change despite the extreme adverse socioeconomic conditions of the region. We show that farmers with entrepreneurial tendencies are more creative in using innovative farm practices such as the planting of shade trees, the use of live fencing, instead of wires, and watershed protection techniques, among others. This result counters the notion that small farmers are at the mercy of climate change and have no possibility of being able to adapt to a changing climate. The study also points to roadblocks that farmers face when coping with climate change. Among those roadblocks are a lack of extension services, access to credit, and reliable internet, all of which reduces access to vital information needed in today’s constantly changing world. Results indicate that, under some circumstances, funding and supporting entrepreneurship programs may provide more benefit than traditional social programs.Keywords: entrepreneurship, global warming, rural communities, climate change adaptation
Procedia PDF Downloads 23925089 How to Use Big Data in Logistics Issues
Authors: Mehmet Akif Aslan, Mehmet Simsek, Eyup Sensoy
Abstract:
Big Data stands for today’s cutting-edge technology. As the technology becomes widespread, so does Data. Utilizing massive data sets enable companies to get competitive advantages over their adversaries. Out of many area of Big Data usage, logistics has significance role in both commercial sector and military. This paper lays out what big data is and how it is used in both military and commercial logistics.Keywords: big data, logistics, operational efficiency, risk management
Procedia PDF Downloads 64125088 Implementation of an IoT Sensor Data Collection and Analysis Library
Authors: Jihyun Song, Kyeongjoo Kim, Minsoo Lee
Abstract:
Due to the development of information technology and wireless Internet technology, various data are being generated in various fields. These data are advantageous in that they provide real-time information to the users themselves. However, when the data are accumulated and analyzed, more various information can be extracted. In addition, development and dissemination of boards such as Arduino and Raspberry Pie have made it possible to easily test various sensors, and it is possible to collect sensor data directly by using database application tools such as MySQL. These directly collected data can be used for various research and can be useful as data for data mining. However, there are many difficulties in using the board to collect data, and there are many difficulties in using it when the user is not a computer programmer, or when using it for the first time. Even if data are collected, lack of expert knowledge or experience may cause difficulties in data analysis and visualization. In this paper, we aim to construct a library for sensor data collection and analysis to overcome these problems.Keywords: clustering, data mining, DBSCAN, k-means, k-medoids, sensor data
Procedia PDF Downloads 37825087 Government (Big) Data Ecosystem: Definition, Classification of Actors, and Their Roles
Authors: Syed Iftikhar Hussain Shah, Vasilis Peristeras, Ioannis Magnisalis
Abstract:
Organizations, including governments, generate (big) data that are high in volume, velocity, veracity, and come from a variety of sources. Public Administrations are using (big) data, implementing base registries, and enforcing data sharing within the entire government to deliver (big) data related integrated services, provision of insights to users, and for good governance. Government (Big) data ecosystem actors represent distinct entities that provide data, consume data, manipulate data to offer paid services, and extend data services like data storage, hosting services to other actors. In this research work, we perform a systematic literature review. The key objectives of this paper are to propose a robust definition of government (big) data ecosystem and a classification of government (big) data ecosystem actors and their roles. We showcase a graphical view of actors, roles, and their relationship in the government (big) data ecosystem. We also discuss our research findings. We did not find too much published research articles about the government (big) data ecosystem, including its definition and classification of actors and their roles. Therefore, we lent ideas for the government (big) data ecosystem from numerous areas that include scientific research data, humanitarian data, open government data, industry data, in the literature.Keywords: big data, big data ecosystem, classification of big data actors, big data actors roles, definition of government (big) data ecosystem, data-driven government, eGovernment, gaps in data ecosystems, government (big) data, public administration, systematic literature review
Procedia PDF Downloads 16225086 Government Big Data Ecosystem: A Systematic Literature Review
Authors: Syed Iftikhar Hussain Shah, Vasilis Peristeras, Ioannis Magnisalis
Abstract:
Data that is high in volume, velocity, veracity and comes from a variety of sources is usually generated in all sectors including the government sector. Globally public administrations are pursuing (big) data as new technology and trying to adopt a data-centric architecture for hosting and sharing data. Properly executed, big data and data analytics in the government (big) data ecosystem can be led to data-driven government and have a direct impact on the way policymakers work and citizens interact with governments. In this research paper, we conduct a systematic literature review. The main aims of this paper are to highlight essential aspects of the government (big) data ecosystem and to explore the most critical socio-technical factors that contribute to the successful implementation of government (big) data ecosystem. The essential aspects of government (big) data ecosystem include definition, data types, data lifecycle models, and actors and their roles. We also discuss the potential impact of (big) data in public administration and gaps in the government data ecosystems literature. As this is a new topic, we did not find specific articles on government (big) data ecosystem and therefore focused our research on various relevant areas like humanitarian data, open government data, scientific research data, industry data, etc.Keywords: applications of big data, big data, big data types. big data ecosystem, critical success factors, data-driven government, egovernment, gaps in data ecosystems, government (big) data, literature review, public administration, systematic review
Procedia PDF Downloads 22825085 A Machine Learning Decision Support Framework for Industrial Engineering Purposes
Authors: Anli Du Preez, James Bekker
Abstract:
Data is currently one of the most critical and influential emerging technologies. However, the true potential of data is yet to be exploited since, currently, about 1% of generated data are ever actually analyzed for value creation. There is a data gap where data is not explored due to the lack of data analytics infrastructure and the required data analytics skills. This study developed a decision support framework for data analytics by following Jabareen’s framework development methodology. The study focused on machine learning algorithms, which is a subset of data analytics. The developed framework is designed to assist data analysts with little experience, in choosing the appropriate machine learning algorithm given the purpose of their application.Keywords: Data analytics, Industrial engineering, Machine learning, Value creation
Procedia PDF Downloads 16825084 Providing Security to Private Cloud Using Advanced Encryption Standard Algorithm
Authors: Annapureddy Srikant Reddy, Atthanti Mahendra, Samala Chinni Krishna, N. Neelima
Abstract:
In our present world, we are generating a lot of data and we, need a specific device to store all these data. Generally, we store data in pen drives, hard drives, etc. Sometimes we may loss the data due to the corruption of devices. To overcome all these issues, we implemented a cloud space for storing the data, and it provides more security to the data. We can access the data with just using the internet from anywhere in the world. We implemented all these with the java using Net beans IDE. Once user uploads the data, he does not have any rights to change the data. Users uploaded files are stored in the cloud with the file name as system time and the directory will be created with some random words. Cloud accepts the data only if the size of the file is less than 2MB.Keywords: cloud space, AES, FTP, NetBeans IDE
Procedia PDF Downloads 20625083 Business Intelligence for Profiling of Telecommunication Customer
Authors: Rokhmatul Insani, Hira Laksmiwati Soemitro
Abstract:
Business Intelligence is a methodology that exploits the data to produce information and knowledge systematically, business intelligence can support the decision-making process. Some methods in business intelligence are data warehouse and data mining. A data warehouse can store historical data from transactional data. For data modelling in data warehouse, we apply dimensional modelling by Kimball. While data mining is used to extracting patterns from the data and get insight from the data. Data mining has many techniques, one of which is segmentation. For profiling of telecommunication customer, we use customer segmentation according to customer’s usage of services, customer invoice and customer payment. Customers can be grouped according to their characteristics and can be identified the profitable customers. We apply K-Means Clustering Algorithm for segmentation. The input variable for that algorithm we use RFM (Recency, Frequency and Monetary) model. All process in data mining, we use tools IBM SPSS modeller.Keywords: business intelligence, customer segmentation, data warehouse, data mining
Procedia PDF Downloads 48325082 Imputation Technique for Feature Selection in Microarray Data Set
Authors: Younies Saeed Hassan Mahmoud, Mai Mabrouk, Elsayed Sallam
Abstract:
Analysing DNA microarray data sets is a great challenge, which faces the bioinformaticians due to the complication of using statistical and machine learning techniques. The challenge will be doubled if the microarray data sets contain missing data, which happens regularly because these techniques cannot deal with missing data. One of the most important data analysis process on the microarray data set is feature selection. This process finds the most important genes that affect certain disease. In this paper, we introduce a technique for imputing the missing data in microarray data sets while performing feature selection.Keywords: DNA microarray, feature selection, missing data, bioinformatics
Procedia PDF Downloads 57425081 PDDA: Priority-Based, Dynamic Data Aggregation Approach for Sensor-Based Big Data Framework
Authors: Lutful Karim, Mohammed S. Al-kahtani
Abstract:
Sensors are being used in various applications such as agriculture, health monitoring, air and water pollution monitoring, traffic monitoring and control and hence, play the vital role in the growth of big data. However, sensors collect redundant data. Thus, aggregating and filtering sensors data are significantly important to design an efficient big data framework. Current researches do not focus on aggregating and filtering data at multiple layers of sensor-based big data framework. Thus, this paper introduces (i) three layers data aggregation and framework for big data and (ii) a priority-based, dynamic data aggregation scheme (PDDA) for the lowest layer at sensors. Simulation results show that the PDDA outperforms existing tree and cluster-based data aggregation scheme in terms of overall network energy consumptions and end-to-end data transmission delay.Keywords: big data, clustering, tree topology, data aggregation, sensor networks
Procedia PDF Downloads 34525080 Automated Feature Extraction and Object-Based Detection from High-Resolution Aerial Photos Based on Machine Learning and Artificial Intelligence
Authors: Mohammed Al Sulaimani, Hamad Al Manhi
Abstract:
With the development of Remote Sensing technology, the resolution of optical Remote Sensing images has greatly improved, and images have become largely available. Numerous detectors have been developed for detecting different types of objects. In the past few years, Remote Sensing has benefited a lot from deep learning, particularly Deep Convolution Neural Networks (CNNs). Deep learning holds great promise to fulfill the challenging needs of Remote Sensing and solving various problems within different fields and applications. The use of Unmanned Aerial Systems in acquiring Aerial Photos has become highly used and preferred by most organizations to support their activities because of their high resolution and accuracy, which make the identification and detection of very small features much easier than Satellite Images. And this has opened an extreme era of Deep Learning in different applications not only in feature extraction and prediction but also in analysis. This work addresses the capacity of Machine Learning and Deep Learning in detecting and extracting Oil Leaks from Flowlines (Onshore) using High-Resolution Aerial Photos which have been acquired by UAS fixed with RGB Sensor to support early detection of these leaks and prevent the company from the leak’s losses and the most important thing environmental damage. Here, there are two different approaches and different methods of DL have been demonstrated. The first approach focuses on detecting the Oil Leaks from the RAW Aerial Photos (not processed) using a Deep Learning called Single Shoot Detector (SSD). The model draws bounding boxes around the leaks, and the results were extremely good. The second approach focuses on detecting the Oil Leaks from the Ortho-mosaiced Images (Georeferenced Images) by developing three Deep Learning Models using (MaskRCNN, U-Net and PSP-Net Classifier). Then, post-processing is performed to combine the results of these three Deep Learning Models to achieve a better detection result and improved accuracy. Although there is a relatively small amount of datasets available for training purposes, the Trained DL Models have shown good results in extracting the extent of the Oil Leaks and obtaining excellent and accurate detection.Keywords: GIS, remote sensing, oil leak detection, machine learning, aerial photos, unmanned aerial systems
Procedia PDF Downloads 3225079 Control the Flow of Big Data
Authors: Shizra Waris, Saleem Akhtar
Abstract:
Big data is a research area receiving attention from academia and IT communities. In the digital world, the amounts of data produced and stored have within a short period of time. Consequently this fast increasing rate of data has created many challenges. In this paper, we use functionalism and structuralism paradigms to analyze the genesis of big data applications and its current trends. This paper presents a complete discussion on state-of-the-art big data technologies based on group and stream data processing. Moreover, strengths and weaknesses of these technologies are analyzed. This study also covers big data analytics techniques, processing methods, some reported case studies from different vendor, several open research challenges and the chances brought about by big data. The similarities and differences of these techniques and technologies based on important limitations are also investigated. Emerging technologies are suggested as a solution for big data problems.Keywords: computer, it community, industry, big data
Procedia PDF Downloads 19425078 High Performance Computing and Big Data Analytics
Authors: Branci Sarra, Branci Saadia
Abstract:
Because of the multiplied data growth, many computer science tools have been developed to process and analyze these Big Data. High-performance computing architectures have been designed to meet the treatment needs of Big Data (view transaction processing standpoint, strategic, and tactical analytics). The purpose of this article is to provide a historical and global perspective on the recent trend of high-performance computing architectures especially what has a relation with Analytics and Data Mining.Keywords: high performance computing, HPC, big data, data analysis
Procedia PDF Downloads 52025077 A Landscape of Research Data Repositories in Re3data.org Registry: A Case Study of Indian Repositories
Authors: Prashant Shrivastava
Abstract:
The purpose of this study is to explore re3dat.org registry to identify research data repositories registration workflow process. Further objective is to depict a graph for present development of research data repositories in India. Preliminarily with an approach to understand re3data.org registry framework and schema design then further proceed to explore the status of research data repositories of India in re3data.org registry. Research data repositories are getting wider relevance due to e-research concepts. Now available registry re3data.org is a good tool for users and researchers to identify appropriate research data repositories as per their research requirements. In Indian environment, a compatible National Research Data Policy is the need of the time to boost the management of research data. Registry for Research Data Repositories is a crucial tool to discover specific information in specific domain. Also, Research Data Repositories in India have not been studied. Re3data.org registry and status of Indian research data repositories both discussed in this study.Keywords: research data, research data repositories, research data registry, re3data.org
Procedia PDF Downloads 32425076 A Study of Cloud Computing Solution for Transportation Big Data Processing
Authors: Ilgin Gökaşar, Saman Ghaffarian
Abstract:
The need for fast processed big data of transportation ridership (eg., smartcard data) and traffic operation (e.g., traffic detectors data) which requires a lot of computational power is incontrovertible in Intelligent Transportation Systems. Nowadays cloud computing is one of the important subjects and popular information technology solution for data processing. It enables users to process enormous measure of data without having their own particular computing power. Thus, it can also be a good selection for transportation big data processing as well. This paper intends to examine how the cloud computing can enhance transportation big data process with contrasting its advantages and disadvantages, and discussing cloud computing features.Keywords: big data, cloud computing, Intelligent Transportation Systems, ITS, traffic data processing
Procedia PDF Downloads 46725075 Harmonic Data Preparation for Clustering and Classification
Authors: Ali Asheibi
Abstract:
The rapid increase in the size of databases required to store power quality monitoring data has demanded new techniques for analysing and understanding the data. One suggested technique to assist in analysis is data mining. Preparing raw data to be ready for data mining exploration take up most of the effort and time spent in the whole data mining process. Clustering is an important technique in data mining and machine learning in which underlying and meaningful groups of data are discovered. Large amounts of harmonic data have been collected from an actual harmonic monitoring system in a distribution system in Australia for three years. This amount of acquired data makes it difficult to identify operational events that significantly impact the harmonics generated on the system. In this paper, harmonic data preparation processes to better understanding of the data have been presented. Underlying classes in this data has then been identified using clustering technique based on the Minimum Message Length (MML) method. The underlying operational information contained within the clusters can be rapidly visualised by the engineers. The C5.0 algorithm was used for classification and interpretation of the generated clusters.Keywords: data mining, harmonic data, clustering, classification
Procedia PDF Downloads 24725074 Index of Suitability for Culex pipiens sl. Mosquitoes in Portugal Mainland
Authors: Maria C. Proença, Maria T. Rebelo, Marília Antunes, Maria J. Alves, Hugo Osório, Sofia Cunha, REVIVE team
Abstract:
The environment of the mosquitoes complex Culex pipiens sl. in Portugal mainland is evaluated based in its abundance, using a data set georeferenced, collected during seven years (2006-2012) from May to October. The suitability of the different regions can be delineated using the relative abundance areas; the suitablility index is directly proportional to disease transmission risk and allows focusing mitigation measures in order to avoid outbreaks of vector-borne diseases. The interest in the Culex pipiens complex is justified by its medical importance: the females bite all warm-blooded vertebrates and are involved in the circulation of several arbovirus of concern to human health, like West Nile virus, iridoviruses, rheoviruses and parvoviruses. The abundance of Culex pipiens mosquitoes were documented systematically all over the territory by the local health services, in a long duration program running since 2006. The environmental factors used to characterize the vector habitat are land use/land cover, distance to cartographed water bodies, altitude and latitude. Focus will be on the mosquito females, which gonotrophic cycle mate-bloodmeal-oviposition is responsible for the virus transmission; its abundance is the key for the planning of non-aggressive prophylactic countermeasures that may eradicate the transmission risk and simultaneously avoid chemical ambient degradation. Meteorological parameters such as: air relative humidity, air temperature (minima, maxima and mean daily temperatures) and daily total rainfall were gathered from the weather stations network for the same dates and crossed with the standardized females’ abundance in a geographic information system (GIS). Mean capture and percentage of above average captures related to each variable are used as criteria to compute a threshold for each meteorological parameter; the difference of the mean capture above/below the threshold was statistically assessed. The meteorological parameters measured at the net of weather stations all over the country are averaged by month and interpolated to produce raster maps that can be segmented according to the meaningful thresholds for each parameter. The intersection of the maps of all the parameters obtained for each month show the evolution of the suitable meteorological conditions through the mosquito season, considered as May to October, although the first and last month are less relevant. In parallel, mean and above average captures were related to the physiographic parameters – the land use/land cover classes most relevant in each month, the altitudes preferred and the most frequent distance to water bodies, a factor closely related with the mosquito biology. The maps produced with these results were crossed with the meteorological maps previously segmented, in order to get an index of suitability for the complex Culex pipiens evaluated all over the country, and its evolution from the beginning to the end of the mosquitoes season.Keywords: suitability index, Culex pipiens, habitat evolution, GIS model
Procedia PDF Downloads 57625073 Linguistic Summarization of Structured Patent Data
Authors: E. Y. Igde, S. Aydogan, F. E. Boran, D. Akay
Abstract:
Patent data have an increasingly important role in economic growth, innovation, technical advantages and business strategies and even in countries competitions. Analyzing of patent data is crucial since patents cover large part of all technological information of the world. In this paper, we have used the linguistic summarization technique to prove the validity of the hypotheses related to patent data stated in the literature.Keywords: data mining, fuzzy sets, linguistic summarization, patent data
Procedia PDF Downloads 27225072 Proposal of Data Collection from Probes
Authors: M. Kebisek, L. Spendla, M. Kopcek, T. Skulavik
Abstract:
In our paper we describe the security capabilities of data collection. Data are collected with probes located in the near and distant surroundings of the company. Considering the numerous obstacles e.g. forests, hills, urban areas, the data collection is realized in several ways. The collection of data uses connection via wireless communication, LAN network, GSM network and in certain areas data are collected by using vehicles. In order to ensure the connection to the server most of the probes have ability to communicate in several ways. Collected data are archived and subsequently used in supervisory applications. To ensure the collection of the required data, it is necessary to propose algorithms that will allow the probes to select suitable communication channel.Keywords: communication, computer network, data collection, probe
Procedia PDF Downloads 36025071 A Review on Big Data Movement with Different Approaches
Authors: Nay Myo Sandar
Abstract:
With the growth of technologies and applications, a large amount of data has been producing at increasing rate from various resources such as social media networks, sensor devices, and other information serving devices. This large collection of massive, complex and exponential growth of dataset is called big data. The traditional database systems cannot store and process such data due to large and complexity. Consequently, cloud computing is a potential solution for data storage and processing since it can provide a pool of resources for servers and storage. However, moving large amount of data to and from is a challenging issue since it can encounter a high latency due to large data size. With respect to big data movement problem, this paper reviews the literature of previous works, discusses about research issues, finds out approaches for dealing with big data movement problem.Keywords: Big Data, Cloud Computing, Big Data Movement, Network Techniques
Procedia PDF Downloads 8625070 Optimized Approach for Secure Data Sharing in Distributed Database
Authors: Ahmed Mateen, Zhu Qingsheng, Ahmad Bilal
Abstract:
In the current age of technology, information is the most precious asset of a company. Today, companies have a large amount of data. As the data become larger, access to data for some particular information is becoming slower day by day. Faster data processing to shape it in the form of information is the biggest issue. The major problems in distributed databases are the efficiency of data distribution and response time of data distribution. The security of data distribution is also a big issue. For these problems, we proposed a strategy that can maximize the efficiency of data distribution and also increase its response time. This technique gives better results for secure data distribution from multiple heterogeneous sources. The newly proposed technique facilitates the companies for secure data sharing efficiently and quickly.Keywords: ER-schema, electronic record, P2P framework, API, query formulation
Procedia PDF Downloads 33325069 Data Mining Algorithms Analysis: Case Study of Price Predictions of Lands
Authors: Julio Albuja, David Zaldumbide
Abstract:
Data analysis is an important step before taking a decision about money. The aim of this work is to analyze the factors that influence the final price of the houses through data mining algorithms. To our best knowledge, previous work was researched just to compare results. Furthermore, before using the data of the data set, the Z-Transformation were used to standardize the data in the same range. Hence, the data was classified into two groups to visualize them in a readability format. A decision tree was built, and graphical data is displayed where clearly is easy to see the results and the factors' influence in these graphics. The definitions of these methods are described, as well as the descriptions of the results. Finally, conclusions and recommendations are presented related to the released results that our research showed making it easier to apply these algorithms using a customized data set.Keywords: algorithms, data, decision tree, transformation
Procedia PDF Downloads 37425068 Application of Blockchain Technology in Geological Field
Authors: Mengdi Zhang, Zhenji Gao, Ning Kang, Rongmei Liu
Abstract:
Management and application of geological big data is an important part of China's national big data strategy. With the implementation of a national big data strategy, geological big data management becomes more and more critical. At present, there are still a lot of technology barriers as well as cognition chaos in many aspects of geological big data management and application, such as data sharing, intellectual property protection, and application technology. Therefore, it’s a key task to make better use of new technologies for deeper delving and wider application of geological big data. In this paper, we briefly introduce the basic principle of blockchain technology at the beginning and then make an analysis of the application dilemma of geological data. Based on the current analysis, we bring forward some feasible patterns and scenarios for the blockchain application in geological big data and put forward serval suggestions for future work in geological big data management.Keywords: blockchain, intellectual property protection, geological data, big data management
Procedia PDF Downloads 8825067 Frequent Item Set Mining for Big Data Using MapReduce Framework
Authors: Tamanna Jethava, Rahul Joshi
Abstract:
Frequent Item sets play an essential role in many data Mining tasks that try to find interesting patterns from the database. Typically it refers to a set of items that frequently appear together in transaction dataset. There are several mining algorithm being used for frequent item set mining, yet most do not scale to the type of data we presented with today, so called “BIG DATA”. Big Data is a collection of large data sets. Our approach is to work on the frequent item set mining over the large dataset with scalable and speedy way. Big Data basically works with Map Reduce along with HDFS is used to find out frequent item sets from Big Data on large cluster. This paper focuses on using pre-processing & mining algorithm as hybrid approach for big data over Hadoop platform.Keywords: frequent item set mining, big data, Hadoop, MapReduce
Procedia PDF Downloads 43425066 The Role Of Data Gathering In NGOs
Authors: Hussaini Garba Mohammed
Abstract:
Background/Significance: The lack of data gathering is affecting NGOs world-wide in general to have good data information about educational and health related issues among communities in any country and around the world. For example, HIV/AIDS smoking (Tuberculosis diseases) and COVID-19 virus carriers is becoming a serious public health problem, especially among old men and women. But there is no full details data survey assessment from communities, villages, and rural area in some countries to show the percentage of victims and patients, especial with this world COVID-19 virus among the people. These data are essential to inform programming targets, strategies, and priorities in getting good information about data gathering in any society.Keywords: reliable information, data assessment, data mining, data communication
Procedia PDF Downloads 17925065 The Application of Data Mining Technology in Building Energy Consumption Data Analysis
Authors: Liang Zhao, Jili Zhang, Chongquan Zhong
Abstract:
Energy consumption data, in particular those involving public buildings, are impacted by many factors: the building structure, climate/environmental parameters, construction, system operating condition, and user behavior patterns. Traditional methods for data analysis are insufficient. This paper delves into the data mining technology to determine its application in the analysis of building energy consumption data including energy consumption prediction, fault diagnosis, and optimal operation. Recent literature are reviewed and summarized, the problems faced by data mining technology in the area of energy consumption data analysis are enumerated, and research points for future studies are given.Keywords: data mining, data analysis, prediction, optimization, building operational performance
Procedia PDF Downloads 85225064 To Handle Data-Driven Software Development Projects Effectively
Authors: Shahnewaz Khan
Abstract:
Machine learning (ML) techniques are often used in projects for creating data-driven applications. These tasks typically demand additional research and analysis. The proper technique and strategy must be chosen to ensure the success of data-driven projects. Otherwise, even exerting a lot of effort, the necessary development might not always be possible. In this post, an effort to examine the workflow of data-driven software development projects and its implementation process in order to describe how to manage a project successfully. Which will assist in minimizing the added workload.Keywords: data, data-driven projects, data science, NLP, software project
Procedia PDF Downloads 8325063 The Relationship Between Artificial Intelligence, Data Science, and Privacy
Authors: M. Naidoo
Abstract:
Artificial intelligence often requires large amounts of good quality data. Within important fields, such as healthcare, the training of AI systems predominately relies on health and personal data; however, the usage of this data is complicated by various layers of law and ethics that seek to protect individuals’ privacy rights. This research seeks to establish the challenges AI and data sciences pose to (i) informational rights, (ii) privacy rights, and (iii) data protection. To solve some of the issues presented, various methods are suggested, such as embedding values in technological development, proper balancing of rights and interests, and others.Keywords: artificial intelligence, data science, law, policy
Procedia PDF Downloads 10625062 Simulation Data Summarization Based on Spatial Histograms
Authors: Jing Zhao, Yoshiharu Ishikawa, Chuan Xiao, Kento Sugiura
Abstract:
In order to analyze large-scale scientific data, research on data exploration and visualization has gained popularity. In this paper, we focus on the exploration and visualization of scientific simulation data, and define a spatial V-Optimal histogram for data summarization. We propose histogram construction algorithms based on a general binary hierarchical partitioning as well as a more specific one, the l-grid partitioning. For effective data summarization and efficient data visualization in scientific data analysis, we propose an optimal algorithm as well as a heuristic algorithm for histogram construction. To verify the effectiveness and efficiency of the proposed methods, we conduct experiments on the massive evacuation simulation data.Keywords: simulation data, data summarization, spatial histograms, exploration, visualization
Procedia PDF Downloads 17625061 Algorithms used in Spatial Data Mining GIS
Authors: Vahid Bairami Rad
Abstract:
Extracting knowledge from spatial data like GIS data is important to reduce the data and extract information. Therefore, the development of new techniques and tools that support the human in transforming data into useful knowledge has been the focus of the relatively new and interdisciplinary research area ‘knowledge discovery in databases’. Thus, we introduce a set of database primitives or basic operations for spatial data mining which are sufficient to express most of the spatial data mining algorithms from the literature. This approach has several advantages. Similar to the relational standard language SQL, the use of standard primitives will speed-up the development of new data mining algorithms and will also make them more portable. We introduced a database-oriented framework for spatial data mining which is based on the concepts of neighborhood graphs and paths. A small set of basic operations on these graphs and paths were defined as database primitives for spatial data mining. Furthermore, techniques to efficiently support the database primitives by a commercial DBMS were presented.Keywords: spatial data base, knowledge discovery database, data mining, spatial relationship, predictive data mining
Procedia PDF Downloads 460