Search results for: parcel data
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 7273

6943 Expanding the Evaluation Criteria for a Wind Turbine Performance

Authors: Ivan Balachin, Geanette Polanco, Jiang Xingliang, Hu Qin

Abstract:

The problem of global warming has raised interest in renewable energy sources. Reducing the cost of wind energy is a challenge. Before a wind park is built, conditions such as average wind speed, wind direction, the duration of each wind, and the probability of icing must be considered in the design phase. The operating values used in setting up control systems will also depend on these variables. Here, a procedure is proposed for inclusion in the evaluation of wind turbine performance, based on the amplitude of wind changes, the number of changes, and their duration. A generic case study based on actual data is presented. Data analysis techniques were applied to model the power required by the yaw system based on the amplitude and amount of wind changes. A theoretical model relating time, amplitude of wind changes, and angular speed of nacelle rotation was identified.

Keywords: field data processing, regression determination, wind turbine performance, wind turbine placing, yaw system losses

Procedia PDF Downloads 351
6942 An Exhaustive All-Subsets Examination of Trade Theory on WTO Data

Authors: Masoud Charkhabi

Abstract:

We examine trade theory with this motivation. The full set of World Trade Organization data is organized into country-year pairs, each treated as a different entity. Topological Data Analysis reveals that among the 16 regions and 240 region-year pairs there is in fact a distinguishable group of region-period pairs. The generally accepted periods of shifts from dissimilar-dissimilar to similar-similar trade in goods among regions are examined from this new perspective. The period breaks are treated as cumulative and are flexible. This type of all-subsets analysis is motivated by computer science and is made possible by Lossy Compression and Graph Theory. The results question many patterns in similar-similar to dissimilar-dissimilar trade. They also show indications of economic shifts that only later become evident in other economic metrics.

Keywords: econometrics, globalization, network science, topological data analysis, trade theory, visualization, world trade

Procedia PDF Downloads 343
6941 Modelling the Education Supply Chain with Network Data Envelopment Analysis

Authors: Sourour Ramzi, Claudia Sarrico

Abstract:

Little has been done on network DEA in education, and nobody has attempted to model the whole education supply chain using network DEA. As such the contribution of the present paper is to propose a model for measuring the efficiency of education supply chains using network DEA. First, we use a general survey of data envelopment analysis (DEA) to establish the emergent themes for research in DEA, and focus on the theme of Network DEA. Second, we use a survey on two-stage DEA models, and Network DEA to write a state of the art on Network DEA, particularly applied to supply chain management. Third, we use a survey on DEA applications to establish the most influential papers on DEA education applications, in order to establish the state of the art on applications of DEA in education, in general, and applications of DEA to education using network DEA, in particular. Finally, we propose a model for measuring the performance of education supply chains of different education systems (countries or states within a country, for instance). We then use this model on some empirical data.

Keywords: supply chain, education, data envelopment analysis, network DEA

Procedia PDF Downloads 346
6940 Online Shopping vs Privacy – Results of an Experimental Study

Authors: Andrzej Poszewiecki

Abstract:

The presented paper contributes to the experimental strand of research on privacy. Privacy is currently being discussed at length, primarily among lawyers and politicians, but it has also been of interest to economists for some time. How people value their privacy is now of great importance, and that is the subject of this article. An experimental method was used: a survey was carried out among customers of an online store, and the studied issue was whether their readiness to sell their data (WTA) differed from their willingness to buy the data back (WTP). The basic aim of this article is to analyse whether people shopping on the Internet value their privacy differently depending on whether they protect or sell it. The results indicate major differences in this respect, which do not always match the original expectations. The results support the hypothesis that people are more willing to sell their data than to repurchase them. However, the hypothesis that the value of the proposed remuneration affects the willingness to sell or buy back personal data (one’s privacy) was not supported.

Keywords: privacy, experimental economics, behavioural economics, internet

Procedia PDF Downloads 265
6939 Static vs. Stream Mining Trajectories Similarity Measures

Authors: Musaab Riyadh, Norwati Mustapha, Dina Riyadh

Abstract:

Trajectory similarity can be defined as the cost of transforming one trajectory into another based on a certain similarity method. It is the core of numerous mining tasks such as clustering, classification, and indexing. Various approaches have been suggested to measure similarity based on the geometric and dynamic properties of trajectories, the overlap between trajectory segments, and the area confined between entire trajectories. In this article, these approaches are evaluated on computational cost, memory usage, accuracy, and the amount of data needed in advance, in order to determine their suitability for stream mining applications. The evaluation results show that, because data are generated at high speed, stream mining applications favour similarity methods that have low computational cost and memory usage, need only a single scan of the data, and are free of mathematical complexity.

Keywords: global distance measure, local distance measure, semantic trajectory, spatial dimension, stream data mining

Procedia PDF Downloads 373
6938 Data and Spatial Analysis for Economy and Education of 28 E.U. Member-States for 2014

Authors: Alexiou Dimitra, Fragkaki Maria

Abstract:

The objective of the paper is to study geographic, economic and educational variables and their contribution to determining the position of each member-state among the EU-28 countries, based on the values of seven variables as given by Eurostat. The data analysis methods of Multiple Factorial Correspondence Analysis (MFCA), Principal Component Analysis and Factor Analysis have been used. The cross-tabulation tables of data consist of the values of seven variables for the 28 countries for 2014. The data are processed using the CHIC Analysis V 1.1 software package. The results of this program using MFCA and Ascending Hierarchical Classification are given in arithmetic and graphical form. For comparison, the Factor procedure of the statistical package IBM SPSS 20 has been applied to the same data. The numerical and graphical results, presented in tables and graphs, demonstrate the agreement between the two methods. The most important result is the study of the relation between the 28 countries and the position of each country in groups or clouds, which are formed according to the values of the corresponding variables.

Keywords: Multiple Factorial Correspondence Analysis, Principal Component Analysis, Factor Analysis, E.U.-28 countries, Statistical package IBM SPSS 20, CHIC Analysis V 1.1 Software, Eurostat.eu Statistics

Procedia PDF Downloads 486
6937 The Optimal Utilization of Centrally Located Land: The Case of the Bloemfontein Show Grounds

Authors: D. F. Coetzee, M. M. Campbell

Abstract:

The urban environment is constantly expanding and the optimal use of centrally located land is important in terms of sustainable development. Bloemfontein has expanded and this affects land-use functions. The purpose of the study is to examine the possible shift in location of the Bloemfontein show grounds to utilize the space of the grounds more effectively in the context of spatial planning. The research method used is qualitative case study research, with the case study on the Bloemfontein show grounds. The purposive sample consisted of planners who work or consult in the Bloemfontein area and who are registered with the South African Council for Planners (SACPLAN). Interviews consisting of qualitative open-ended questionnaires were used. When considering relocation, the social and economic aspects need to be considered. The findings also indicated a majority consensus that the property can be utilized more effectively in terms of mixed land use. The showground development trust compiled a master plan to ensure that the property is used to its full potential without the relocation of the showground function itself. This Master Plan can be seen as the next logical step for the showground property, and it is indeed an attempt to better utilize the land parcel without relocating the show function. The question arises whether the proposed Master Plan is a permanent solution or whether it is merely delaying the relocation of the core showground function to another location. For now, it is a sound solution, making the best of the situation at hand and utilizing the property more effectively. If the show grounds were to be relocated, the researcher recommended mixed-use development, in terms of an expansion of the commercial business/retail function together with a sport and recreation function. The show grounds in Bloemfontein are well positioned to capitalize on and to meet the needs of the changing economy, while complementing the future economic growth strategies of the city if the right plans are in place.

Keywords: centrally located land, spatial planning, show grounds, central business district

Procedia PDF Downloads 383
6936 Development of a Spatial Data for Renal Registry in Nigeria Health Sector

Authors: Adekunle Kolawole Ojo, Idowu Peter Adebayo, Egwuche Sylvester O.

Abstract:

Chronic Kidney Disease (CKD) is a significant cause of morbidity and mortality across developed and developing nations and is associated with increased risk. There are no existing electronic means of capturing and monitoring CKD in Nigeria. The work is aimed at developing a spatial data model that can be used to implement renal registries required for tracking and monitoring the spatial distribution of renal diseases by public health officers and patients. In this study, we have developed a spatial data model for a functional renal registry.

Keywords: renal registry, health informatics, chronic kidney disease, interface

Procedia PDF Downloads 158
6935 Environmental Evaluation of Two Kind of Drug Production (Syrup and Pomade Form) Using Life Cycle Assessment Methodology

Authors: H. Aksas, S. Boughrara, K. Louhab

Abstract:

The goal of this study was to use life cycle assessment (LCA) methodology to assess the environmental impact of pharmaceutical products (four kinds in syrup form and three kinds in pomade form) produced by a leading manufacturer in Algeria, the SAIDAL company. The impacts generated were evaluated using SimaPro 7.1, with the CML92 method for the syrup form and EPD 2007 for the pomade form. All evaluated impacts were compared with one another, and the compound contributing to each impact was determined in each case. The data needed to conduct the Life Cycle Inventory (LCI) came from this factory: theoretical data were collected from the responsible technicians and engineers of the company, and practical data resulted from assays of the pharmaceutical liquids carried out at the laboratories of the university. These data cover the various raw materials, imported from European and Asian countries, needed to formulate the drugs. On the input side, the energy used comes from Algerian resources. The outputs are the results of effluent analysis for this factory in its different forms (liquid, solid and gas). All these data (inputs and outputs) make up the ecobalance.

Keywords: pharmaceutical product, drug residues, LCA methodology, environmental impacts

Procedia PDF Downloads 228
6934 Preparation of Wireless Networks and Security; Challenges in Efficient Accession of Encrypted Data in Healthcare

Authors: M. Zayoud, S. Oueida, S. Ionescu, P. AbiChar

Abstract:

Background: A wireless sensor network comprises diverse information technology tools and is widely applied in a range of domains, including military surveillance, weather forecasting, and earthquake forecasting. Although wireless sensor networks are built on strengthened foundations, security issues usually emerge in professional applications. Thus, the essential technological tools for secure aggregation of data need to be assessed. Moreover, such practices should be incorporated into healthcare practice, where they serve the mutual interest of all parties. Objective: Aggregation of encrypted data has been assessed through a homomorphic stream cipher to assure its effectiveness and to provide optimum solutions for the field of healthcare. Methods: An experimental design has been used, which utilized a newly developed cipher along with CPU-constrained devices. Modular additions have also been employed to evaluate the nature of the aggregated data. The processes of the homomorphic stream cipher have been highlighted through different sensors and modular additions. Results: The homomorphic stream cipher has been recognized as a simple and secure process that allows efficient aggregation of encrypted data. In addition, the application has led to the improvement of healthcare practices. Statistical values can be easily computed through aggregation on the basis of the selected cipher. The variance, mean, and standard deviation of the sensed data have also been computed with the selected tool. Conclusion: It can be concluded that a homomorphic stream cipher can be an ideal tool for appropriate aggregation of data. It should also provide the best solutions for the healthcare sector.
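The aggregation by modular addition described in this abstract can be illustrated with a minimal sketch. It assumes a generic additively homomorphic scheme in which each sensor adds a secret key to its reading modulo M; it is not the cipher from the paper. The modulus `M`, the function names, and the use of random keys in place of a real keystream are all assumptions made for illustration.

```python
import secrets

M = 2**32  # assumed modulus; must exceed the largest possible true sum


def encrypt(value: int, key: int) -> int:
    """Sensor side: mask a reading by adding a secret key modulo M."""
    return (value + key) % M


def aggregate(ciphertexts) -> int:
    """Aggregator side: add ciphertexts without ever decrypting them."""
    total = 0
    for c in ciphertexts:
        total = (total + c) % M
    return total


def decrypt_sum(aggregate_ct: int, keys) -> int:
    """Sink side: remove the sum of all keys to recover the plaintext sum."""
    return (aggregate_ct - sum(keys)) % M


def aggregate_stats(readings):
    """Recover mean, variance and standard deviation from encrypted sums.

    Each sensor sends two ciphertexts: its reading and the square of its
    reading, each masked with an independent key (random here; a real
    deployment would derive keys from a shared keystream).
    """
    n = len(readings)
    keys_v = [secrets.randbelow(M) for _ in readings]
    keys_s = [secrets.randbelow(M) for _ in readings]
    ct_v = [encrypt(x, k) for x, k in zip(readings, keys_v)]
    ct_s = [encrypt(x * x, k) for x, k in zip(readings, keys_s)]
    total = decrypt_sum(aggregate(ct_v), keys_v)
    total_sq = decrypt_sum(aggregate(ct_s), keys_s)
    mean = total / n
    variance = total_sq / n - mean ** 2
    return mean, variance, variance ** 0.5
```

For example, `aggregate_stats([20, 22, 21, 25])` recovers the mean and variance of the four readings even though the aggregator only ever adds masked values.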

Keywords: aggregation, cipher, homomorphic stream, encryption

Procedia PDF Downloads 229
6933 Mathematics Bridging Theory and Applications for a Data-Driven World

Authors: Zahid Ullah, Atlas Khan

Abstract:

In today's data-driven world, the role of mathematics in bridging the gap between theory and applications is becoming increasingly vital. This abstract highlights the significance of mathematics as a powerful tool for analyzing, interpreting, and extracting meaningful insights from vast amounts of data. By integrating mathematical principles with real-world applications, researchers can unlock the full potential of data-driven decision-making processes. This abstract delves into the various ways mathematics acts as a bridge connecting theoretical frameworks to practical applications. It explores the utilization of mathematical models, algorithms, and statistical techniques to uncover hidden patterns, trends, and correlations within complex datasets. Furthermore, it investigates the role of mathematics in enhancing predictive modeling, optimization, and risk assessment methodologies for improved decision-making in diverse fields such as finance, healthcare, engineering, and social sciences. The abstract also emphasizes the need for interdisciplinary collaboration between mathematicians, statisticians, computer scientists, and domain experts to tackle the challenges posed by the data-driven landscape. By fostering synergies between these disciplines, novel approaches can be developed to address complex problems and make data-driven insights accessible and actionable. Moreover, this abstract underscores the importance of robust mathematical foundations for ensuring the reliability and validity of data analysis. Rigorous mathematical frameworks not only provide a solid basis for understanding and interpreting results but also contribute to the development of innovative methodologies and techniques. In summary, this abstract advocates for the pivotal role of mathematics in bridging theory and applications in a data-driven world. 
By harnessing mathematical principles, researchers can unlock the transformative potential of data analysis, paving the way for evidence-based decision-making, optimized processes, and innovative solutions to the challenges of our rapidly evolving society.

Keywords: mathematics, bridging theory and applications, data-driven world, mathematical models

Procedia PDF Downloads 47
6932 Unstructured-Data Content Search Based on Optimized EEG Signal Processing and Multi-Objective Feature Extraction

Authors: Qais M. Yousef, Yasmeen A. Alshaer

Abstract:

Over the last few years, the amount of data available around the globe has increased rapidly. This has coincided with the emergence of recent concepts such as big data and the Internet of Things, which have furnished a suitable solution for making data available all over the world. However, managing this massive amount of data remains a challenge because of its large variety of types and its distribution. Locating the required file, particularly at the first attempt, has therefore become a difficult task, owing to the many similar names of different files distributed on the web. Consequently, the accuracy and speed of search have been negatively affected. This work presents a method that uses electroencephalography (EEG) signals to locate files based on their contents. Building on the concept of natural mind-wave processing, this work analyses the mind-wave signals of different people, extracts their most appropriate features using a multi-objective metaheuristic algorithm, and then classifies them using an artificial neural network to distinguish among files with similar names. The aim of this work is to provide the ability to find files based on their contents using human thoughts only. Implementing this approach and testing it on real people demonstrated its ability to find the desired files accurately, within a noticeably shorter time, and to retrieve them as the first choice for the user.

Keywords: artificial intelligence, data contents search, human active memory, mind wave, multi-objective optimization

Procedia PDF Downloads 151
6931 A Bivariate Inverse Generalized Exponential Distribution and Its Applications in Dependent Competing Risks Model

Authors: Fatemah A. Alqallaf, Debasis Kundu

Abstract:

The aim of this paper is to introduce a bivariate inverse generalized exponential distribution which has a singular component. The proposed bivariate distribution can be used when the marginals have heavy-tailed distributions and non-monotone hazard functions. Due to the presence of the singular component, it can be used quite effectively when there are ties in the data. Since it has four parameters, it is a very flexible bivariate distribution, and it can be used quite effectively for analyzing various bivariate data sets. Several dependency properties and dependency measures have been obtained. The maximum likelihood estimators cannot be obtained in closed form, and computing them involves solving a four-dimensional optimization problem. To avoid that, we propose an EM algorithm that involves solving only one non-linear equation at each 'E'-step. Hence, the implementation of the proposed EM algorithm is very straightforward in practice. Extensive simulation experiments and the analysis of one data set have been performed. We have observed that the proposed bivariate inverse generalized exponential distribution can be used for modeling dependent competing risks data. One data set has been analyzed to show the effectiveness of the proposed model.

Keywords: Block and Basu bivariate distributions, competing risks, EM algorithm, Marshall-Olkin bivariate exponential distribution, maximum likelihood estimators

Procedia PDF Downloads 114
6930 Active Contours for Image Segmentation Based on Complex Domain Approach

Authors: Sajid Hussain

Abstract:

The complex domain approach for image segmentation based on active contour has been designed, which deforms step by step to partition an image into numerous expedient regions. A novel region-based trigonometric complex pressure force function is proposed, which propagates around the region of interest using image forces. The signed trigonometric force function controls the propagation of the active contour and the active contour stops on the exact edges of the object accurately. The proposed model makes the level set function binary and uses Gaussian smoothing kernel to adjust and escape the re-initialization procedure. The working principle of the proposed model is as follows: The real image data is transformed into complex data by iota (i) times of image data and the average iota (i) times of horizontal and vertical components of the gradient of image data is inserted in the proposed model to catch complex gradient of the image data. A simple finite difference mathematical technique has been used to implement the proposed model. The efficiency and robustness of the proposed model have been verified and compared with other state-of-the-art models.

Keywords: image segmentation, active contour, level set, Mumford and Shah model

Procedia PDF Downloads 74
6929 Comparison of Different k-NN Models for Speed Prediction in an Urban Traffic Network

Authors: Seyoung Kim, Jeongmin Kim, Kwang Ryel Ryu

Abstract:

Consider a database that records average traffic speeds measured at five-minute intervals for all the links in the traffic network of a metropolitan city. While learning models from this data that can predict future traffic speed would be beneficial for applications such as car navigation systems, building predictive models for every link becomes a nontrivial job if the number of links in a given network is huge. An advantage of adopting k-nearest neighbor (k-NN) as the predictive model is that it does not require any explicit model building. However, k-NN takes a long time to make a prediction because it needs to search for the k nearest neighbors in the database at prediction time. In this paper, we investigate how much we can speed up k-NN in making traffic speed predictions by reducing the amount of data to be searched without a significant sacrifice of prediction accuracy. The rationale is that we had better look at only the recent data, because traffic patterns not only repeat daily or weekly but also change over time. In our experiments, we build several different k-NN models employing different sets of features, namely the current and past traffic speeds of the target link and of its upstream and downstream neighbor links. The performance of these models is compared by measuring the average prediction accuracy and the average time taken to make a prediction using various amounts of data.
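The idea of restricting the k-NN search to recent data can be sketched as follows. This is an illustrative toy under stated assumptions, not the models from the paper: the record layout (a feature vector of recent speeds paired with the speed observed next), the Euclidean distance, and the `window` parameter are all hypothetical.

```python
import math


def knn_predict(history, query, k=3, window=None):
    """Predict the next speed as the mean over the k nearest past records.

    history: list of (feature_vector, next_speed) pairs, oldest first.
    query:   feature vector for the current moment.
    window:  if a positive integer, search only the most recent `window`
             records, trading some accuracy for a shorter search.
    """
    if window is not None and window > 0:
        candidates = history[-window:]
    else:
        candidates = history
    # Brute-force search: sort the candidates by distance to the query.
    nearest = sorted(candidates, key=lambda rec: math.dist(rec[0], query))[:k]
    return sum(speed for _, speed in nearest) / len(nearest)


# Hypothetical records: features are the two latest speeds on a link (km/h).
history = [
    ([30, 32], 31),
    ([50, 52], 55),
    ([31, 33], 34),
    ([29, 30], 28),
    ([60, 61], 62),
]
full = knn_predict(history, [30, 31], k=2)
recent = knn_predict(history, [30, 31], k=2, window=3)
```

Shrinking `window` reduces the number of distance computations linearly, which is the trade-off between prediction time and accuracy that the abstract describes.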

Keywords: big data, k-NN, machine learning, traffic speed prediction

Procedia PDF Downloads 333
6928 Logistics Information Systems in the Distribution of Flour in Nigeria

Authors: Cornelius Femi Popoola

Abstract:

This study investigated logistics information systems in the distribution of flour in Nigeria. A case study design was used and 50 staff of Honeywell Flour Mill were sampled for the study. Data generated through a questionnaire were analysed using correlation and regression analysis. The findings of the study revealed that logistics information systems such as e-commerce, interactive telephone systems and electronic data interchange correlated positively with the distribution of flour in Honeywell Flour Mill. The findings also showed that e-commerce, interactive telephone systems and electronic data interchange jointly and positively contribute to the distribution of flour in Honeywell Flour Mill in Nigeria (R = .935; Adj. R2 = .642; F (3,47) = 14.739; p < .05). The study therefore recommended that Honeywell Flour Mill should upgrade its logistics information systems to computer-to-computer communication of business transactions and documents, as well as adopt new technology such as tracking-and-tracing systems (barcode scanning for packages and pallets), vehicle tracking with the Global Positioning System (GPS), measurement of vehicle performance with ‘black boxes’ (containing logistic data), and Automatic Equipment Identification (AEI).

Keywords: e-commerce, electronic data interchange, flour distribution, information system, interactive telephone systems

Procedia PDF Downloads 522
6927 A Novel Probabilistic Spatial Locality of Reference Technique for Automatic Cleansing of Digital Maps

Authors: A. Abdullah, S. Abushalmat, A. Bakshwain, A. Basuhail, A. Aslam

Abstract:

GIS (Geographic Information System) applications require geo-referenced data. This data may be available as databases or in the form of digital or hard-copy agro-meteorological maps. These parameter maps are color-coded, with different regions corresponding to different parameter values, and converting them into a database is not very difficult. However, text and various planimetric elements overlaid on these maps make accurate image-to-database conversion a challenging problem. The reason is that it is almost impossible to replace exactly what was underneath the text or icons, which points to the need for inpainting. In this paper, we propose a probabilistic inpainting approach that uses the probability of spatial locality of colors in the map to replace overlaid elements with the underlying color. We tested the limits of our proposed technique using non-textual simulated data, and compared its text-removal results with those of a popular image editing tool using public domain data, with promising results.

Keywords: noise, image, GIS, digital map, inpainting

Procedia PDF Downloads 324
6926 Reversible Information Hitting in Encrypted JPEG Bitstream by LSB Based on Inherent Algorithm

Authors: Vaibhav Barve

Abstract:

Reversible information hiding has drawn a lot of interest lately. Because it is reversible, the original digital data can be restored completely. It is a scheme in which secret data is stored in digital media such as images, video, or audio to prevent unauthorized access and for security purposes. Generally, a JPEG bitstream is used to store this secret data: first the JPEG bitstream is encrypted into a well-organized structure, and then the secret information is embedded into the encrypted region by slightly modifying the JPEG bitstream. The pixels suitable for information embedding are computed, and the secret details are embedded accordingly. In our proposed framework, we use the RC4 algorithm to encrypt the JPEG bitstream. The encryption key is supplied by the user of the framework and is also used at decryption time. We implement enhanced least significant bit (LSB) replacement steganography using a genetic algorithm. First, the number of bits to be embedded in a given coefficient is adaptive. By using proper parameters, we can obtain high capacity while ensuring high security. We use a logistic map to shuffle the bits and a GA (Genetic Algorithm) to find the right parameters for the logistic map. An information embedding key is used at embedding time. With the correct image encryption key and information embedding key, the receiver can easily extract the embedded secret data and completely recover both the original image and the original secret information. When the embedding key is absent, the original image can still be recovered approximately, with sufficient quality, without the embedding key.
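The logistic-map bit shuffling mentioned in this abstract can be sketched as below. This is one common way to turn a chaotic sequence into a permutation, assumed here for illustration rather than taken from the paper: the map x_{n+1} = r * x_n * (1 - x_n) is iterated and the indices are ordered by the generated values. The seed `x0`, the parameter `r`, and the burn-in length are hypothetical stand-ins for the parameters the genetic algorithm would search for.

```python
def logistic_permutation(n, x0=0.3141, r=3.99, burn_in=100):
    """Derive a permutation of range(n) from a logistic-map sequence."""
    x = x0
    for _ in range(burn_in):  # discard the initial transient
        x = r * x * (1.0 - x)
    values = []
    for _ in range(n):
        x = r * x * (1.0 - x)
        values.append(x)
    # Ordering the indices by their chaotic value yields the permutation.
    return sorted(range(n), key=values.__getitem__)


def shuffle_bits(bits, perm):
    """Place bit perm[pos] of the input at position pos of the output."""
    return [bits[i] for i in perm]


def unshuffle_bits(shuffled, perm):
    """Invert shuffle_bits using the same permutation."""
    original = [0] * len(perm)
    for pos, i in enumerate(perm):
        original[i] = shuffled[pos]
    return original


secret = [1, 0, 1, 1, 0, 0, 1, 0]
perm = logistic_permutation(len(secret))
restored = unshuffle_bits(shuffle_bits(secret, perm), perm)
```

Because the permutation depends only on `(x0, r, burn_in)`, a receiver who knows these values can regenerate it and undo the shuffle, which is what makes the parameters usable as key material.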

Keywords: data embedding, decryption, encryption, reversible data hiding, steganography

Procedia PDF Downloads 267
6925 Design and Implementation of Security Middleware for Data Warehouse Signature, Framework

Authors: Mayada Al Meghari

Abstract:

Recently, grid middleware has enabled large-scale integrated use of network resources, such as shared data and CPUs, so that they act as a virtual supercomputer. In this work, we present the design and implementation of the middleware for the Data Warehouse Signature (DWS) framework. The aim of using the middleware in our DWS framework is to achieve high performance through parallel computing. This middleware is developed on the Alchemi.Net framework to increase security among the network nodes through an authentication and group-key distribution model. This model achieves key security and prevents intermediate attacks in the middleware. This paper presents the flow process structures of the middleware design. In addition, the paper describes how security is implemented in the DWS middleware using the authentication and group-key distribution model. Finally, based on an analysis of other middleware approaches, the developed middleware of the DWS framework is the optimal solution for complete coverage of security issues.

Keywords: middleware, parallel computing, data warehouse, security, group-key, high performance

Procedia PDF Downloads 89
6924 Sentiment Classification of Documents

Authors: Swarnadip Ghosh

Abstract:

Sentiment Analysis is the process of detecting the contextual polarity of text. In other words, it determines whether a piece of writing is positive, negative or neutral. Sentiment analysis of documents holds great importance in today's world, when vast amounts of information are stored in databases and on the World Wide Web. An efficient algorithm to elicit such information would be beneficial for social, economic, and medical purposes. In this project, we have developed an algorithm to classify a document as positive or negative. Using our algorithm, we obtained a feature set from the data and classified the documents based on this feature set. It is important to note that, in the classification, we have not used the independence assumption made by many procedures such as Naive Bayes. This makes the algorithm more general in scope. Moreover, because of the sparsity and high dimensionality of such data, we did not use the empirical distribution for estimation, but developed a method based on finding the degree of close clustering of the data points. We have applied our algorithm to a movie review data set obtained from IMDb and obtained satisfactory results.

Keywords: sentiment, Run's Test, cross validation, higher dimensional pmf estimation

Procedia PDF Downloads 375
6923 Corporate Governance and Bank Performance: A Study of Selected Deposit Money Banks in Nigeria

Authors: Ayodele Ajayi, John Ajayi

Abstract:

This paper investigates the effect of corporate governance with a view to determining the relationship between board size and bank performance. Data for the study were obtained from the audited financial statements of five sampled banks listed on the Nigerian Stock Exchange. Panel data technique was adopted and analysis was carried out with the use of multiple regression and pooled ordinary least square. Results from the study show that the larger the board size, the greater the profit implying that corporate governance is positively correlated with bank performance.

Keywords: corporate governance, banks performance, board size, pooled data

Procedia PDF Downloads 321
6922 Empowering a New Frontier in Heart Disease Detection: Unleashing Quantum Machine Learning

Authors: Sadia Nasrin Tisha, Mushfika Sharmin Rahman, Javier Orduz

Abstract:

Machine learning is applied in a variety of fields throughout the world. The healthcare sector has benefited enormously from it. One of the most effective approaches for predicting human heart diseases is to use machine learning applications to classify data and predict the outcome as a classification. However, with the rapid advancement of quantum technology, quantum computing has emerged as a potential game-changer for many applications. Quantum algorithms have the potential to execute substantially faster than their classical equivalents, which can lead to significant improvements in computational performance and efficiency. In this study, we applied quantum machine learning concepts to predict coronary heart diseases from text data. We ran three experiments using three different feature sets. The data set consisted of 100 data points. We aim to carry out a comparative analysis of the two approaches, highlighting the potential benefits of quantum machine learning for predicting heart diseases.

Keywords: quantum machine learning, SVM, QSVM, matrix product state

Procedia PDF Downloads 64
6921 Verification & Validation of Map Reduce Program Model for Parallel K-Mediod Algorithm on Hadoop Cluster

Authors: Trapti Sharma, Devesh Kumar Srivastava

Abstract:

This paper analyzes a MapReduce implementation and verifies and validates the MapReduce solution model for the Parallel K-Mediod algorithm on a Hadoop cluster. MapReduce is a programming model that enables the processing of huge amounts of data in parallel on a large number of devices. It is especially well suited to constant or moderately changing data sets, since the implementation point of a position is usually high. MapReduce has gradually become the framework of choice for “big data”. The MapReduce model allows systematic and rapid processing of large-scale data on a cluster of compute nodes. One of the primary concerns in Hadoop is how to minimize the completion time (i.e., makespan) of a set of MapReduce jobs. In this paper, we have verified and validated various MapReduce applications such as wordcount, grep, terasort, and the parallel K-Mediod clustering algorithm. We found that as the number of nodes increases, the completion time decreases.
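
As a rough illustration of the programming model described in this abstract, the wordcount application can be sketched in plain Python. This is a single-process stand-in, not Hadoop code and not the authors' implementation; the function names `map_phase`, `shuffle`, `reduce_phase`, and `word_count` are hypothetical.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group emitted values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reducer: sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over a list of lines."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
```

On a real cluster the mappers and reducers run on different nodes, which is why adding nodes can shorten the completion time for work that splits cleanly by key.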

Keywords: hadoop, mapreduce, k-mediod, validation, verification

Procedia PDF Downloads 342
6920 An Improved K-Means Algorithm for Gene Expression Data Clustering

Authors: Billel Kenidra, Mohamed Benmohammed

Abstract:

Data mining techniques for clustering are a subject of active research and assist in biological pattern recognition and the extraction of new knowledge from raw data. Clustering is the act of partitioning an unlabeled dataset into groups of similar objects. Each group, called a cluster, consists of objects that are similar to one another and dissimilar to objects of other groups. Several clustering methods are based on partitional clustering. This category attempts to directly decompose the dataset into a set of disjoint clusters, leading to an integer number of clusters that optimizes a given criterion function. The criterion function may emphasize a local or a global structure of the data, and its optimization is an iterative relocation procedure. The K-Means algorithm is one of the most widely used partitional clustering techniques. Since K-Means is extremely sensitive to the initial choice of centers, and a poor choice of centers may lead to a local optimum that is quite inferior to the global optimum, we propose a strategy for initializing K-Means centers. The improved K-Means algorithm is compared with the original K-Means, and the results show that efficiency is significantly improved.
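
The abstract does not say which initialization strategy the authors propose. As a stand-in, the sketch below uses the well-known k-means++ style of seeding (each new center is drawn with probability proportional to its squared distance from the nearest center already chosen), followed by the usual iterative relocation. The function names and the default `seed` are assumptions made for illustration.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def seed_centers(points, k, rng):
    """k-means++ style seeding (a swapped-in technique, not the paper's)."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        weights = [min(sq_dist(p, c) for c in centers) for p in points]
        if sum(weights) == 0:
            # Every point coincides with a chosen center; fall back to uniform.
            centers.append(rng.choice(points))
        else:
            centers.append(rng.choices(points, weights=weights, k=1)[0])
    return centers


def k_means(points, k, seed=0, max_iter=100):
    """Seed the centers, then alternate assignment and relocation steps."""
    rng = random.Random(seed)
    centers = seed_centers(points, k, rng)
    clusters = []
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centers[i]))
            clusters[nearest].append(p)
        # Relocation step: move each center to the mean of its cluster.
        new_centers = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centers[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers, clusters
```

Spreading the initial centers apart makes it less likely that two of them start inside the same natural group, which is the failure mode the abstract attributes to a poor initial choice.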

Keywords: microarray data mining, biological pattern recognition, partitional clustering, k-means algorithm, centroid initialization

Procedia PDF Downloads 163
6919 An IoT-Enabled Crop Recommendation System Utilizing Message Queuing Telemetry Transport (MQTT) for Efficient Data Transmission to AI/ML Models

Authors: Prashansa Singh, Rohit Bajaj, Manjot Kaur

Abstract:

In the modern agricultural landscape, precision farming has emerged as a pivotal strategy for enhancing crop yield and optimizing resource utilization. This paper introduces an innovative Crop Recommendation System (CRS) that leverages Internet of Things (IoT) technology and the Message Queuing Telemetry Transport (MQTT) protocol to collect critical environmental and soil data via sensors deployed across agricultural fields. The system is designed to address the challenges of real-time data acquisition, efficient data transmission, and dynamic crop recommendation through the application of advanced Artificial Intelligence (AI) and Machine Learning (ML) models. The CRS architecture encompasses a network of sensors that continuously monitor environmental parameters such as temperature, humidity, soil moisture, and nutrient levels. This sensor data is then transmitted to a central MQTT server, ensuring reliable and low-latency communication even in the bandwidth-constrained scenarios typical of rural agricultural settings. Upon reaching the server, the data is processed and analyzed by AI/ML models trained to correlate specific environmental conditions with optimal crop choices and cultivation practices. These models consider historical crop performance data, current agricultural research, and real-time field conditions to generate tailored crop recommendations. The implementation achieves 99% accuracy.
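
To make the publish/subscribe flow concrete, the sketch below shows how one sensor reading might be packaged as an MQTT topic and payload, and how a subscriber's topic filter selects it. It uses no MQTT client library and opens no connection; the `farm/<field>/<sensor>` topic layout, the JSON payload shape, and both function names are assumptions, not details from the paper. The filter matching follows the standard MQTT wildcards (`+` for one level, `#` for all remaining levels) and ignores the special handling of topics that begin with `$`.

```python
import json


def build_message(field_id, sensor, value, unit):
    """Return (topic, payload bytes) for one reading; the layout is hypothetical."""
    topic = f"farm/{field_id}/{sensor}"
    payload = json.dumps({"value": value, "unit": unit}, sort_keys=True)
    return topic, payload.encode("utf-8")


def topic_matches(topic_filter, topic):
    """Check a topic against a filter using the MQTT '+' and '#' wildcards."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            # '#' is only valid as the last level and matches everything below.
            return i == len(filter_levels) - 1
        if i >= len(topic_levels):
            return False
        if level != "+" and level != topic_levels[i]:
            return False
    return len(filter_levels) == len(topic_levels)
```

A recommendation service could, for example, subscribe with the filter `farm/+/soil_moisture` to receive that one parameter from every field.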

Keywords: IoT, MQTT protocol, machine learning, sensor, publish, subscriber, agriculture, humidity

Procedia PDF Downloads 31
6918 Extracting Opinions from Big Data of Indonesian Customer Reviews Using Hadoop MapReduce

Authors: Veronica S. Moertini, Vinsensius Kevin, Gede Karya

Abstract:

Customer reviews are collected by many kinds of e-commerce websites selling products, services, hotel rooms, tickets, and so on. Each website collects its own customer reviews. The reviews can be crawled from those websites and stored as big data. Text analysis techniques can then be used to analyze that data and produce summarized information, such as customer opinions. These opinions can be published by independent service provider websites and used to help customers choose the most suitable products or services. Because the opinions are derived from big data of reviews originating from many websites, the results are expected to be more trusted and accurate. Indonesian customers write reviews in the Indonesian language, which has its own structure and unique features. We found that most of the reviews are written in “daily language”, which is informal, does not follow correct grammar, and contains many abbreviations and slang or non-formal words. Hadoop is an emerging platform for storing and analyzing big data in distributed systems. A Hadoop cluster consists of master and slave nodes (computers) operating in a network. Hadoop comes with a distributed file system (HDFS) and the MapReduce framework for supporting parallel computation. However, MapReduce is inefficient for iterative computations; specifically, the cost of reading and writing data (I/O cost) is high. Given this fact, we conclude that MapReduce is best suited to “one-pass” computation. In this research, we develop an efficient technique for extracting or mining opinions from big data of Indonesian reviews, based on MapReduce with one-pass computation. In designing the algorithm, we avoid iterative computation and instead adopt a “look-up table” technique.
The stages of the proposed technique are: (1) crawling the review data from websites; (2) cleaning and finding root words from the raw reviews; (3) computing the frequency of the meaningful opinion words; (4) analyzing customers’ sentiments towards defined objects. The experiments for evaluating the performance of the technique were conducted on a Hadoop cluster with 14 slave nodes. The results show that the proposed technique (stages 2 to 4) discovers useful opinions and is capable of processing big data efficiently and scalably.
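
Stages 2 to 4 above can be sketched as a single pass over the reviews with table lookups in place of iteration. This is a toy, single-process illustration rather than the authors' Hadoop job: the two tables hold only a few made-up entries, and the names `ROOT_WORDS`, `POLARITY`, `map_review`, `count_opinions`, and `summarise` are hypothetical.

```python
from collections import Counter

# Hypothetical look-up tables; a real system would load much larger ones.
ROOT_WORDS = {"bgs": "bagus", "bagus": "bagus", "jelek": "jelek"}
POLARITY = {"bagus": "positive", "jelek": "negative"}


def map_review(review):
    """Stage 2: clean each token and emit (root word, 1) if the table knows it."""
    for token in review.lower().split():
        root = ROOT_WORDS.get(token.strip(".,!?"))
        if root is not None:
            yield root, 1


def count_opinions(reviews):
    """Stage 3: total the opinion-word frequencies in one pass over the reviews."""
    counts = Counter()
    for review in reviews:
        for root, one in map_review(review):
            counts[root] += one
    return counts


def summarise(counts):
    """Stage 4: roll word frequencies up into sentiment totals."""
    totals = Counter()
    for root, n in counts.items():
        totals[POLARITY[root]] += n
    return dict(totals)
```

Because every token is resolved by a dictionary lookup, each review is read once and nothing has to be written back and re-read between rounds, which is the I/O cost the abstract identifies for iterative MapReduce jobs.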

Keywords: big data analysis, Hadoop MapReduce, analyzing text data, mining Indonesian reviews

Procedia PDF Downloads 183
6917 Quality Assurance for the Climate Data Store

Authors: Judith Klostermann, Miguel Segura, Wilma Jans, Dragana Bojovic, Isadora Christel Jimenez, Francisco Doblas-Reyes, Judit Snethlage

Abstract:

The Climate Data Store (CDS), developed by the Copernicus Climate Change Service (C3S) implemented by the European Centre for Medium-Range Weather Forecasts (ECMWF) on behalf of the European Union, is intended to become a key instrument for exploring climate data. The CDS contains both raw and processed data to provide users with information about the past, present, and future climate of the earth. It allows easy and free access to climate data and indicators, presenting an important asset for scientists and stakeholders on the path to achieving a more sustainable future. The C3S Evaluation and Quality Control (EQC) function is assessing the quality of the CDS by undertaking a comprehensive user requirement assessment to measure users’ satisfaction. Recommendations will be developed for the improvement and expansion of the CDS datasets and products. User requirements will be identified regarding the fitness of the datasets, the toolbox, and the overall CDS service. The EQC function of the CDS will help C3S to make the service more robust: built on validated data that follows high-quality standards while being user-friendly. This function will be developed in close collaboration with the users of the service. Through their feedback, suggestions, and contributions, the CDS can become more accessible and meet the requirements of a diverse range of users. Stakeholders and their active engagement are thus an important aspect of CDS development. This will be achieved through direct interactions with users, such as meetings, interviews, or workshops, as well as feedback mechanisms such as surveys or helpdesk services at the CDS. The results provided by the users will be categorized by CDS product so that their specific interests can be monitored and linked to the right product. 
Through this procedure, we will identify the requirements and criteria for data and products in order to build the corresponding recommendations for the improvement and expansion of the CDS datasets and products.

Keywords: climate data store, Copernicus, quality, user engagement

Procedia PDF Downloads 124
6916 Interpretation and Clustering Framework for Analyzing ECG Survey Data

Authors: Irum Matloob, Shoab Ahmad Khan, Fahim Arif

Abstract:

The Indo-Pak region has been affected by heart disease for many decades. Many surveys have shown that the percentage of cardiac patients in Pakistan is increasing day by day, and this issue needs special attention. A framework is proposed for performing detailed analysis of data from an ECG survey conducted to measure the prevalence of heart disease in Pakistan. The ECG survey data is evaluated or filtered using automated Minnesota codes, and only those ECGs that fulfil the standardized conditions specified in the Minnesota codes are used for further analysis. Feature selection is then performed by applying the proposed algorithm, based on a discernibility matrix, to select relevant features from the database. Clustering is performed to expose natural clusters in the ECG survey data by applying a spectral clustering algorithm using the fuzzy c-means algorithm. The hidden patterns and interesting relationships exposed by this analysis are useful for further detailed analysis and for many other purposes.
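
The fuzzy c-means step mentioned in this abstract can be sketched as follows. This is the textbook algorithm only (soft memberships, centers as membership-weighted means); it does not include the spectral embedding or the authors' framework, and the parameter defaults (`m=2.0`, `iters=200`, `tol`, `seed`) are assumptions.

```python
import math
import random


def membership_row(point, centers, m):
    """Degree of membership of one point in each cluster; the row sums to 1."""
    dists = [math.dist(point, ctr) for ctr in centers]
    if 0.0 in dists:
        # The point sits exactly on a center: give it full membership there.
        zeros = dists.count(0.0)
        return [1.0 / zeros if d == 0.0 else 0.0 for d in dists]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** power for d_k in dists) for d_i in dists]


def fuzzy_c_means(points, c, m=2.0, iters=200, tol=1e-6, seed=0):
    """Alternate membership and center updates until the centers stop moving."""
    rng = random.Random(seed)
    centers = rng.sample(points, c)
    dims = len(points[0])
    for _ in range(iters):
        memberships = [membership_row(p, centers, m) for p in points]
        new_centers = []
        for i in range(c):
            # Each center is the mean of all points weighted by membership^m.
            weights = [row[i] ** m for row in memberships]
            total = sum(weights)
            new_centers.append(tuple(
                sum(w * p[d] for w, p in zip(weights, points)) / total
                for d in range(dims)
            ))
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    memberships = [membership_row(p, centers, m) for p in points]
    return centers, memberships
```

Unlike K-Means, every record keeps a graded membership in each cluster, which may suit borderline cases better than a hard assignment.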

Keywords: arrhythmias, centroids, ECG, clustering, discernibility matrix

Procedia PDF Downloads 443
6915 A Virtual Grid Based Energy Efficient Data Gathering Scheme for Heterogeneous Sensor Networks

Authors: Siddhartha Chauhan, Nitin Kumar Kotania

Abstract:

Traditional Wireless Sensor Networks (WSNs) generally use static sinks to collect data from the sensor nodes via multi-hop forwarding. As a result, the network suffers from problems such as long message relay times and bottlenecks, which reduce network performance. Many approaches have been proposed that address these problems by using a mobile sink to collect the data from the sensor nodes, but these approaches still suffer from the buffer overflow problem caused by the limited memory size of sensor nodes. This paper proposes an energy-efficient scheme for data gathering that overcomes the buffer overflow problem. The proposed scheme creates a virtual grid structure of heterogeneous nodes and is designed for sensor nodes with variable sensing rates. Every node computes its buffer overflow time, and cluster heads are elected on this basis. The proposed scheme uses a controlled traversing approach to transmit data to the sink. The effectiveness of the proposed scheme is verified by simulation.
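
A minimal sketch of the overflow-time idea follows. It assumes that a node's buffer overflow time is its buffer capacity divided by its (positive) sensing rate, that the virtual grid is formed from square cells, and that the node with the longest overflow time in each cell becomes cluster head. The abstract does not state the election rule, so that last choice and all names here are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    node_id: str
    x: float
    y: float
    buffer_bytes: int
    sensing_rate: float  # bytes generated per second, assumed > 0

    @property
    def overflow_time(self):
        """Seconds until an empty buffer fills at this node's sensing rate."""
        return self.buffer_bytes / self.sensing_rate


def grid_cell(node, cell_size):
    """Map a node position to its virtual grid cell (column, row)."""
    return (int(node.x // cell_size), int(node.y // cell_size))


def elect_cluster_heads(nodes, cell_size):
    """Pick one head per cell: here, the node that can buffer the longest."""
    heads = {}
    for node in nodes:
        cell = grid_cell(node, cell_size)
        best = heads.get(cell)
        if best is None or node.overflow_time > best.overflow_time:
            heads[cell] = node
    return heads
```

With per-cell overflow times available, a mobile sink could order its visits so that the cells closest to overflowing are served first.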

Keywords: buffer overflow problem, mobile sink, virtual grid, wireless sensor networks

Procedia PDF Downloads 356
6914 Analysis of ECGs Survey Data by Applying Clustering Algorithm

Authors: Irum Matloob, Shoab Ahmad Khan, Fahim Arif

Abstract:

The Indo-Pak region has been affected by heart disease for many decades. Many surveys have shown that the percentage of cardiac patients in Pakistan is increasing day by day, and this issue needs special attention. A framework is proposed for performing detailed analysis of data from an ECG survey conducted to measure the prevalence of heart disease in Pakistan. The ECG survey data is evaluated or filtered using automated Minnesota codes, and only those ECGs that fulfil the standardized conditions specified in the Minnesota codes are used for further analysis. Feature selection is then performed by applying the proposed algorithm, based on a discernibility matrix, to select relevant features from the database. Clustering is performed to expose natural clusters in the ECG survey data by applying a spectral clustering algorithm using the fuzzy c-means algorithm. The hidden patterns and interesting relationships exposed by this analysis are useful for further detailed analysis and for many other purposes.
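
For readers unfamiliar with the discernibility matrix used in the feature selection step, the sketch below builds the textbook rough-set version: for every pair of records with different decision labels, it stores the set of attributes on which the two records differ. Attributes that appear alone in some entry form the core and cannot be dropped. The authors' own selection algorithm is not described in the abstract, and the function names are assumptions.

```python
from itertools import combinations


def discernibility_matrix(rows, decisions):
    """Map each pair (i, j) with different decisions to their differing attributes."""
    matrix = {}
    for i, j in combinations(range(len(rows)), 2):
        if decisions[i] != decisions[j]:
            matrix[(i, j)] = frozenset(
                attr for attr in rows[i] if rows[i][attr] != rows[j][attr]
            )
    return matrix


def core_attributes(matrix):
    """Attributes that are the only way to tell some pair of records apart."""
    return {next(iter(entry)) for entry in matrix.values() if len(entry) == 1}
```

In the toy case of three records described by two attributes, where only one attribute separates an abnormal record from a normal one, that attribute is returned as the core.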

Keywords: arrhythmias, centroids, ECG, clustering, discernibility matrix

Procedia PDF Downloads 328