Search results for: data block
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 25631

24911 Dissimilarity Measure for General Histogram Data and Its Application to Hierarchical Clustering

Authors: K. Umbleja, M. Ichino

Abstract:

Symbolic data mining has been developed to analyze data in very large datasets. It is also useful in cases where entry-specific details should remain hidden. Symbolic data mining is quickly gaining popularity as the datasets that need to be analyzed are becoming ever larger. One type of such symbolic data is the histogram, which makes it possible to store large amounts of information in a single variable with a high level of granularity. Other types of symbolic data can also be described as histograms, which makes the histogram a very important and general symbolic data type: a method developed for histograms can also be applied to other types of symbolic data. Due to their complex structure, histograms are complicated to analyze. This paper proposes a method that compares two histogram-valued variables and thereby finds a dissimilarity between two histograms. The proposed method uses the Ichino-Yaguchi dissimilarity measure for mixed feature-type data analysis as a base and develops a dissimilarity measure specifically for histogram data, which allows the comparison of histograms with different numbers of bins and bin widths (so-called general histograms). The proposed dissimilarity measure is then used as a measure for clustering. Furthermore, a linkage method based on weighted averages is proposed, together with the concept of cluster compactness to measure the quality of clustering. The method is then validated by application to real datasets. As a result, the proposed dissimilarity measure is found to produce adequate and comparable results with general histograms without loss of detail or the need to transform the data.
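As a rough illustration of the base measure named in the abstract, the sketch below computes the Ichino-Yaguchi dissimilarity for a single interval-valued feature, in the form in which it is commonly stated. It does not reproduce the paper's extension to general histograms; the function name and the example intervals are assumptions made for illustration.

```python
def ichino_yaguchi_interval(a, b, gamma=0.5):
    """Ichino-Yaguchi dissimilarity between two closed intervals.

    a and b are (low, high) pairs; gamma is the usual weight in [0, 0.5].
    Illustrative sketch only: the histogram-specific measure from the
    abstract is not reproduced here.
    """
    if not 0.0 <= gamma <= 0.5:
        raise ValueError("gamma must lie in [0, 0.5]")
    a_lo, a_hi = a
    b_lo, b_hi = b
    # Join: the smallest interval covering both a and b.
    join_len = max(a_hi, b_hi) - min(a_lo, b_lo)
    # Meet: the overlap of a and b (length 0 if they are disjoint).
    meet_len = max(0.0, min(a_hi, b_hi) - max(a_lo, b_lo))
    return join_len - meet_len + gamma * (
        2 * meet_len - (a_hi - a_lo) - (b_hi - b_lo)
    )


# Two overlapping intervals, a hypothetical example.
d = ichino_yaguchi_interval((0.0, 2.0), (1.0, 3.0), gamma=0.5)
```

Identical intervals give a dissimilarity of zero, and the value grows as the intervals move apart or differ in width.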

Keywords: dissimilarity measure, hierarchical clustering, histograms, symbolic data analysis

Procedia PDF Downloads 157
24910 WiFi Data Offloading: Bundling Method in a Canvas Business Model

Authors: Majid Mokhtarnia, Alireza Amini

Abstract:

Mobile operators face increasing data traffic as a critical issue. As a result, a vital responsibility of the operators is to deal with this trend in order to create added value. This paper addresses a bundling method in a Canvas business model under a WiFi Data Offloading (WDO) strategy, by which some elements of the model may be affected. The proposed method sells a number of data packages to subscribers, some of which include a complimentary volume of WiFi-offloaded data. The present paper analyses this method in terms of attractiveness and profitability. The results demonstrate that the quality of the WDO implementation strongly affects the final result and help the decision maker choose the best option.

Keywords: bundling, canvas business model, telecommunication, WiFi data offloading

Procedia PDF Downloads 193
24909 Modern Construction Methods and Technologies and Their Impacts on Construction Projects

Authors: Michael Anthony Doherty

Abstract:

Modern Methods of Construction (MMC) is a significant topic in the construction industry. In reviewing MMC across different fields that are significant in the modern construction world, the following areas where MMC is developing were assessed: supply chain management, automation, digital technology, and new construction technologies. Different methods were considered as an approach to researching and exploring areas within the construction industry that are making advancements using Modern Construction Methods and Technologies (MCMTs). The research was conducted using the following methodologies: literature review of academic sources, primary and secondary data sources, questionnaires, and interviews. The paper is composed of two parts, a literature review and a questionnaire used as the basis for interviews, which were utilised to achieve the following key objectives: to identify the MCMTs being implemented in the construction industry, to research and compile information on these methods, to determine their purpose and application in the industry, and to establish which MCMTs are being used in the industry while also determining the success of the methods. The research considers the evolution and development of these methods in projects and within the industry itself.
Major findings were as follows: automation technologies such as robotics and offsite fabrication utilising automated production lines are increasingly part of project execution; digital technologies such as AR and VR are increasingly utilised in project co-ordination; MCMTs are proving to be a solution to construction industry problems such as a lack of skilled workforce and hazardous work tasks and situations; new construction technologies are available and finding their place in mainstream construction; SCM and GSCM are evolving to new levels using new systems and technologies such as blockchain technology; and company size and project size influence the use and adoption of MCMTs. In summary, the paper endeavours to identify and detail how areas of MCMTs are developing and gaining traction within mainstream construction.

Keywords: automation, digital technology, new construction technologies, supply chain management

Procedia PDF Downloads 60
24908 Distributed Perceptually Important Point Identification for Time Series Data Mining

Authors: Tak-Chung Fu, Ying-Kit Hung, Fu-Lai Chung

Abstract:

In the field of time series data mining, the concept of the Perceptually Important Point (PIP) identification process was first introduced in 2001. The process was originally used for financial time series pattern matching and was later found suitable for time series dimensionality reduction and representation. Its strength lies in preserving the overall shape of the time series by identifying its salient points. With the rise of Big Data, time series data accounts for a major proportion of it, especially data generated by sensors in the Internet of Things (IoT) environment. Given the nature of PIP identification and its successful applications, it is worth further exploring the opportunity to apply PIP to time series ‘Big Data’. However, the performance of PIP identification is always considered a limitation when dealing with ‘Big’ time series data. In this paper, two distributed versions of PIP identification based on the Specialized Binary (SB) Tree are proposed. The proposed approaches remove the bottleneck of running the PIP identification process on a standalone computer. The distributed versions achieve an improvement in terms of speed.
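For orientation, a minimal sequential sketch of PIP identification is given below. It assumes the commonly used vertical-distance criterion: start from the first and last points, then repeatedly add the point farthest (vertically) from the line joining its two neighbouring PIPs. It is neither the SB-Tree variant nor either of the distributed versions proposed in the paper; the function name and sample series are hypothetical.

```python
def identify_pips(series, n_pips):
    """Return the sorted indices of n_pips perceptually important points.

    Sequential baseline using vertical distance (an assumption); the
    SB-Tree and distributed versions from the abstract are not shown.
    """
    n = len(series)
    if n < 2 or not 2 <= n_pips <= n:
        raise ValueError("need 2 <= n_pips <= len(series)")
    pips = [0, n - 1]  # the first two PIPs are the endpoints
    while len(pips) < n_pips:
        best_idx, best_dist = None, -1.0
        # Scan every gap between adjacent PIPs for the farthest point.
        for left, right in zip(pips, pips[1:]):
            for i in range(left + 1, right):
                # Value of the straight line joining the two PIPs at i.
                slope = (series[right] - series[left]) / (right - left)
                on_line = series[left] + slope * (i - left)
                dist = abs(series[i] - on_line)
                if dist > best_dist:
                    best_idx, best_dist = i, dist
        if best_idx is None:  # no candidate points remain
            break
        pips.append(best_idx)
        pips.sort()
    return pips


# Hypothetical series with one peak and one trough.
example = [0, 0, 4, 0, 0, -3, 0]
pips = identify_pips(example, 4)
```

Each added point requires a scan over all remaining points, which is the kind of cost that motivates tree-based and distributed variants on long series.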

Keywords: distributed computing, performance analysis, Perceptually Important Point identification, time series data mining

Procedia PDF Downloads 428
24907 Analysing Techniques for Fusing Multimodal Data in Predictive Scenarios Using Convolutional Neural Networks

Authors: Philipp Ruf, Massiwa Chabbi, Christoph Reich, Djaffar Ould-Abdeslam

Abstract:

In recent years, convolutional neural networks (CNNs) have demonstrated high performance in image analysis, but often only structured data is available for a specific problem. By interpreting structured data as images, CNNs can effectively learn and extract valuable insights from tabular data, leading to improved predictive accuracy and uncovering hidden patterns that may not be apparent in traditional structured data analysis. Applying a single neural network to multimodal data, e.g., both structured and unstructured information, can yield significant advantages in terms of time complexity and energy efficiency. Converting structured data into images and merging them with existing visual material offers a promising solution for applying CNNs to multimodal datasets, which often occur in a medical context. By employing suitable preprocessing techniques, structured data is transformed into image representations, where the respective features are expressed as different formations of colors and shapes. In an additional step, these representations are fused with existing images to incorporate both types of information. The resulting image is then analyzed using a CNN.

Keywords: CNN, image processing, tabular data, mixed dataset, data transformation, multimodal fusion

Procedia PDF Downloads 119
24906 Knowledge Discovery and Data Mining Techniques in Textile Industry

Authors: Filiz Ersoz, Taner Ersoz, Erkin Guler

Abstract:

This paper addresses issues in the textile industry using data mining techniques. Data mining was applied to data on the stitching of garment products obtained from a textile company. The techniques applied to the data were the CHAID algorithm, the CART algorithm, regression analysis, and artificial neural networks. Classification-based analyses were used for the data mining, and a decision model for production per person and the variables affecting production was found by this method. The results show that as the daily working time increases, the production per person decreases. In addition, the relationship between total daily working time and production per person is negative, and it is the strongest relationship found for production per person.

Keywords: data mining, textile production, decision trees, classification

Procedia PDF Downloads 346
24905 Investigation of Delivery of Triple Play Data in GE-PON Fiber to the Home Network

Authors: Ashima Anurag Sharma

Abstract:

Optical fiber-based networks can deliver performance that supports the increasing demand for high-speed connections. One of the new technologies that has emerged in recent years is the Passive Optical Network. This research paper aims to show the simultaneous delivery of triple play services (data, voice, and video). A comparison between various data rates is presented. It is demonstrated that as the data rate increases, the number of users that can be supported decreases due to the increase in bit error rate.

Keywords: BER, PON, TDMPON, GPON, CWDM, OLT, ONT

Procedia PDF Downloads 523
24904 Microarray Gene Expression Data Dimensionality Reduction Using PCA

Authors: Fuad M. Alkoot

Abstract:

Different experimental technologies such as microarray sequencing have been proposed to generate high-resolution genetic data, in order to understand the complex dynamic interactions between complex diseases and the biological system components of genes and gene products. However, the generated samples have a very large dimension, reaching thousands, which hinders attempts to design a classifier system that can identify diseases based on such data. Additionally, the high overlap in the class distributions makes the task more difficult. The data we experiment with was generated for the identification of autism. It includes 142 samples, which is small compared to the large dimension of the data. Classifier systems trained on this data yield very low classification rates that are almost equivalent to a guess. We aim to reduce the data dimension and improve the data for classification. Here, we experiment with applying a multistage PCA to the genetic data to reduce its dimensionality. Results show a significant improvement in the classification rates, which increases the possibility of building an automated system for autism detection.
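The dimensionality reduction step can be pictured with a plain, single-stage PCA. The sketch below uses NumPy's SVD on synthetic data; the abstract does not specify how the multistage PCA is staged, so that part is not reproduced. The feature count (2000), the number of retained components (20), and the function name are assumptions; only the sample count (142) comes from the abstract.

```python
import numpy as np


def pca_reduce(X, n_components):
    """Project X (samples x features) onto its top principal components.

    Single-stage PCA via SVD; an illustrative stand-in, not the
    multistage procedure described in the abstract.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)  # centre each feature
    # Rows of Vt are the principal directions, ordered by singular value.
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]
    explained = (S[:n_components] ** 2) / np.sum(S ** 2)
    return Xc @ components.T, explained


# Synthetic stand-in: 142 samples (as in the abstract) with a
# hypothetical 2000 features each.
rng = np.random.default_rng(0)
X = rng.normal(size=(142, 2000))
Z, explained = pca_reduce(X, n_components=20)
```

With far fewer samples than features, at most 141 components carry variance after centring, so the reduced representation is necessarily small relative to the original dimension.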

Keywords: PCA, gene expression, dimensionality reduction, classification, autism

Procedia PDF Downloads 557
24903 Data Science-Based Key Factor Analysis and Risk Prediction of Diabetic

Authors: Fei Gao, Rodolfo C. Raga Jr.

Abstract:

This research proposal aims to ascertain the major risk factors for diabetes and to design a predictive model for risk assessment. The project aims to improve early detection and management of diabetes by utilizing data science techniques, which may improve patient outcomes and healthcare efficiency. The phase relation values of each attribute were used to analyze and choose the attributes that might influence the examiner's survival probability, using the Diabetes Health Indicators Dataset from Kaggle as the research data. We compare and evaluate eight machine learning algorithms. Our investigation begins with comprehensive data preprocessing, including feature engineering and dimensionality reduction, aimed at enhancing data quality. The dataset, comprising health indicators and medical data, serves as a foundation for training and testing these algorithms. A rigorous cross-validation process is applied, and we assess performance using five key metrics: accuracy, precision, recall, F1-score, and area under the receiver operating characteristic curve (AUC-ROC). After analyzing the data characteristics, we investigate their impact on the likelihood of diabetes and develop corresponding risk indicators.
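Four of the five metrics listed above follow directly from the counts of a binary confusion matrix. The sketch below shows the standard definitions; the counts are made up for illustration and are not results from the study, and AUC-ROC is omitted because it needs predicted scores rather than counts.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against empty denominators when a class is never predicted
    # or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts, not taken from the study.
metrics = classification_metrics(tp=40, fp=10, fn=20, tn=30)
```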

Keywords: diabetes, risk factors, predictive model, risk assessment, data science techniques, early detection, data analysis, Kaggle

Procedia PDF Downloads 71
24902 A Methodology to Integrate Data in the Company Based on the Semantic Standard in the Context of Industry 4.0

Authors: Chang Qin, Daham Mustafa, Abderrahmane Khiat, Pierre Bienert, Paulo Zanini

Abstract:

Companies currently face many challenges in the process of digital transformation, which can be a complex and costly undertaking. Digital transformation involves the collection and analysis of large amounts of data, which can create challenges around data management and governance. Furthermore, it is challenging to integrate data from multiple systems and technologies. Despite these difficulties, companies are still pursuing digitalization because, by embracing advanced technologies, they can improve efficiency, quality, decision-making, and customer experience while also creating different business models and revenue streams. This paper focuses on the issue of data being stored in data silos with different schemas and structures. The conventional approaches to addressing this issue involve data warehousing, data integration tools, data standardization, and business intelligence tools. However, these approaches primarily focus on the grammar and structure of the data and neglect the importance of semantic modeling and semantic standardization, which are essential for achieving data interoperability. In this work, the challenge of data silos in Industry 4.0 is addressed by developing a semantic modeling approach compliant with Asset Administration Shell (AAS) models as an efficient standard for communication in Industry 4.0. The paper highlights how our approach can facilitate the data mapping process and semantic lifting according to existing industry standards such as ECLASS and other industrial dictionaries. It also incorporates the Asset Administration Shell technology to model and map the company’s data and utilizes a knowledge graph for data storage and exploration.

Keywords: data interoperability in industry 4.0, digital integration, industrial dictionary, semantic modeling

Procedia PDF Downloads 90
24901 Constant Order Predictor Corrector Method for the Solution of Modeled Problems of First Order IVPs of ODEs

Authors: A. A. James, A. O. Adesanya, M. R. Odekunle, D. G. Yakubu

Abstract:

This paper examines the development of a one-step, five-hybrid-point method for the solution of first order initial value problems. We adopted the method of collocation and interpolation of a power series approximate solution to generate a continuous linear multistep method. The continuous linear multistep method was evaluated at selected grid points to give the discrete linear multistep method. The method was implemented using a constant order predictor of order seven over an overlapping interval. The basic properties of the derived corrector were investigated, and it was found to be zero-stable, consistent, and convergent. The region of absolute stability was also investigated. The method was tested on some numerical experiments and found to compete favorably with existing methods.
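The predictor-corrector idea itself can be shown with a much simpler scheme than the one derived in the paper. The sketch below uses Heun's method, with an explicit Euler predictor followed by a trapezoidal corrector, on a first order initial value problem. It is a low-order stand-in chosen for brevity, not the order-seven hybrid method of the abstract; the test problem y' = -y, y(0) = 1 is an assumption.

```python
import math


def heun_predictor_corrector(f, t0, y0, t_end, n_steps):
    """Integrate y' = f(t, y) from t0 to t_end with Heun's method.

    Predictor: explicit Euler. Corrector: trapezoidal rule evaluated at
    the predicted value. Second-order stand-in, not the paper's method.
    """
    h = (t_end - t0) / n_steps
    t, y = t0, y0
    for _ in range(n_steps):
        slope_start = f(t, y)
        y_pred = y + h * slope_start                 # predictor step
        slope_end = f(t + h, y_pred)
        y = y + 0.5 * h * (slope_start + slope_end)  # corrector step
        t += h
    return y


# Assumed test problem: y' = -y, y(0) = 1, exact solution exp(-t).
approx = heun_predictor_corrector(lambda t, y: -y, 0.0, 1.0, 1.0, 100)
error = abs(approx - math.exp(-1.0))
```

Halving the step size should reduce the error by roughly a factor of four, which is a quick check that the corrector is being applied correctly.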

Keywords: interpolation, approximate solution, collocation, differential system, half step, converges, block method, efficiency

Procedia PDF Downloads 333
24900 Characterization of Inkjet-Printed Carbon Nanotube Electrode Patterns on Cotton Fabric

Authors: N. Najafi, Laleh Maleknia, M. E. Olya

Abstract:

An aqueous conductive ink of single-walled carbon nanotubes (SWCNTs) for inkjet printing was formulated. To prepare a homogeneous SWCNT ink with particles small enough not to block a commercial inkjet printer nozzle, we used a kinetic ball-milling process to disperse the SWCNTs in an aqueous suspension. When a patterned electrode was overlaid by repeated inkjet printings of the ink on various types of fabric, the fabric resistance decreased rapidly following a power law, reaching approximately 760 Ω/sq, which is the lowest value ever reported for a dozen printings. The Raman and Fourier transform infrared spectra revealed that the oxidation of the SWCNTs was the source of the doped impurities. This study also showed that the droplet ejection velocity can have an impact on the CNT distribution and consequently on the electrical performance of the ink.

Keywords: ink-jet printing, carbon nanotube, fabric ink, cotton fabric, raman spectroscopy, fourier transform infrared spectroscopy, dozen printings

Procedia PDF Downloads 418
24899 Insecticidal Effects of Plant Extracts of Thymus daenensis and Eucalyptus camaldulensis on Callosobruchus maculatus (Coleoptera: Bruchidae)

Authors: Afsoon Danesh Afrooz, Sohrab Imani, Ali Ahadiyat, Aref Maroof, Yahya Ostadi

Abstract:

This study investigated alternative and safe botanical pesticides to replace chemical insecticides. The effects of plant extracts of Eucalyptus camaldulensis and Thymus daenensis were tested against adults of Callosobruchus maculatus F. Experiments were carried out at 27 ± 1°C and 60 ± 5% RH under dark conditions, adopting a completely randomized block design. Three replicates were set up for five concentrations of each plant extract. LC50 values were determined with SPSS 16.0 software. The LC50 values indicated that the plant extract of Thymus daenensis, with an LC50 of 1.708 µl/l air against adults, was more effective than the plant extract of Eucalyptus camaldulensis, with an LC50 of 12.755 µl/l air. It was found that the plant extract of Thymus daenensis, in comparison with the extract of Eucalyptus camaldulensis, could be used as a pesticide to control stored-product pests.

Keywords: callosobruchus maculatus, Eucalyptus camaldulensis, insecticidal effects, Thymus daenensis

Procedia PDF Downloads 322
24898 Big Data Analytics and Data Security in the Cloud via Fully Homomorphic Encryption

Authors: Waziri Victor Onomza, John K. Alhassan, Idris Ismaila, Noel Dogonyaro Moses

Abstract:

This paper describes the problem of building secure computational services for encrypted information in cloud computing without decrypting the encrypted data. It thus addresses the need for a computational encryption model that could enhance the security of big data with respect to the privacy, confidentiality, and availability of users. The cryptographic model applied to the computational processing of the encrypted data is the Fully Homomorphic Encryption scheme. We contribute theoretical presentations of high-level computational processes based on number theory and algebra that can easily be integrated and leveraged in cloud computing, with detailed theoretical mathematical concepts for the fully homomorphic encryption models. This contribution supports the full implementation of a cryptographic security algorithm based on big data analytics.
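The core idea of computing on data without decrypting it can be illustrated with a toy example. The sketch below uses unpadded textbook RSA, which is only multiplicatively (partially) homomorphic: multiplying two ciphertexts yields a ciphertext of the product of the plaintexts. This is not a fully homomorphic scheme and is not the construction discussed in the paper; the tiny primes and messages are assumptions chosen for readability and offer no security.

```python
# Toy textbook RSA with tiny primes (insecure; for illustration only).
p, q = 61, 53
n = p * q                    # public modulus
phi = (p - 1) * (q - 1)
e = 17                       # public exponent, coprime with phi
d = pow(e, -1, phi)          # private exponent (Python 3.8+)


def encrypt(m):
    """Encrypt an integer message m < n with the public key."""
    return pow(m, e, n)


def decrypt(c):
    """Decrypt a ciphertext c with the private key."""
    return pow(c, d, n)


# The party holding only ciphertexts multiplies them without ever
# seeing the plaintexts 7 and 3.
c1, c2 = encrypt(7), encrypt(3)
c_product = (c1 * c2) % n

# The key owner decrypts the result and recovers 7 * 3.
recovered = decrypt(c_product)
```

A fully homomorphic scheme extends this property to both addition and multiplication, which is what allows arbitrary computations on encrypted data.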

Keywords: big data analytics, security, privacy, bootstrapping, homomorphic, homomorphic encryption scheme

Procedia PDF Downloads 374
24897 Approaches to Eco-Friendly Architecture: Modules Assembled Specially to Conserve

Authors: Arshleen Kaur, Sarang Barbarwar, Madhusudan Hamirwasia

Abstract:

Sustainable architecture is going to be the soul of construction in the near future, with building material as a vital link connecting sustainability to construction. The priority in architecture has shifted from having a smaller negative footprint to having a positive footprint on Earth. Design has to be eco-centric as well as anthro-centric to attain its true purpose. A brick is as important to a building as a cell is to the body. The study focuses on this basic building block with an experimental material and technique known as Module Assembled Specially to Conserve (MASC). The study explores the usage and assembly of these modules in the construction of buildings. It also presents an impact assessment of the modules on the environment and their significance in reducing the carbon footprint of the construction industry. Aspects such as the cost-effectiveness, ease of working, and reusability of MASC have been studied as well.

Keywords: anthro-centric, carbon footprint, eco-centric, sustainable

Procedia PDF Downloads 173
24896 Dynamic Building Simulation Based Study to Understand Thermal Behavior of High-Rise Structural Timber Buildings

Authors: Timothy O. Adekunle, Sigridur Bjarnadottir

Abstract:

Several studies have investigated thermal behavior of buildings with limited studies focusing on high-rise buildings. Of the limited investigations that have considered thermal performance of high-rise buildings, only a few studies have considered thermal behavior of high-rise structural sustainable buildings. As a result, this study investigates the thermal behavior of a high-rise structural timber building. The study aims to understand the thermal environment of a high-rise structural timber block of apartments located in East London, UK by comparing the indoor environmental conditions at different floors (ground and upper floors) of the building. The environmental variables (temperature and relative humidity) were measured at 15-minute intervals for a few weeks in the summer of 2012 to generate data that was considered for calibration and validation of the simulated results. The study employed mainly dynamic thermal building simulation using DesignBuilder by EnergyPlus and supplemented with environmental monitoring as major techniques for data collection and analysis. The weather file (Test Reference Years- TRYs) for the 2000s from the weather generator carried out by the Prometheus Group was considered for the simulation since the study focuses on investigating thermal behavior of high-rise structural timber buildings in the summertime and not in extreme summertime. In this study, the simulated results (May-September of the 2000s) will be the focus of discussion, but the results will be briefly compared with the environmental monitoring results. The simulated results followed a similar trend with the findings obtained from the short period of the environmental monitoring at the building. The results revealed lower temperatures are often predicted (at least 1.1°C lower) at the ground floor than the predicted temperatures at the upper floors. 
The simulated results also showed that higher temperatures are predicted in southeast-facing spaces (at least 0.5°C higher) than in spaces with other orientations across the floors considered. There is, however, a noticeable difference between the thermal environments of spaces when the results obtained from the environmental monitoring are compared with the simulated results. The field survey revealed that higher temperatures were recorded in the living areas (at least 1.0°C higher), while the simulation predicts higher temperatures in bedrooms (at least 0.9°C higher) than in living areas. In addition, the simulated results showed that spaces on the lower floors of high-rise structural timber buildings are predicted to provide a more comfortable thermal environment than spaces on upper floors in summer, but this may not be the case in wintertime due to the strong upward movement of hot air to spaces on upper floors.

Keywords: building simulation, high-rise, structural timber buildings, sustainable, temperatures, thermal behavior

Procedia PDF Downloads 173
24895 Impact of Agricultural Infrastructure on Diffusion of Technology of the Sample Farmers in North 24 Parganas District, West Bengal

Authors: Saikat Majumdar, D. C. Kalita

Abstract:

The agriculture sector plays an important role in the rural economy of India. It is the backbone of the Indian economy and is the dominant sector in terms of employment and livelihood. Agriculture still contributes significantly to export earnings and is an important source of raw materials as well as of demand for many industrial products, particularly fertilizers, pesticides, agricultural implements, and a variety of consumer goods. The performance of the agricultural sector influences the growth of the Indian economy. According to the 2011 Agricultural Census of India, an estimated 61.5 percent of the rural population is dependent on agriculture. Proper agricultural infrastructure has the potential to transform the existing traditional agriculture into a modern, commercial, and dynamic farming system in India through the diffusion of technology. The rate of adoption of modern technology reflects the progress of development in the agricultural sector. The adoption of any improved agricultural technology also depends on the development of road infrastructure or the road network. The present study consisted of 300 sample farmers, of which 150 were taken from the developed area and the remaining 150 from the underdeveloped area. The sample farmers in the developed and underdeveloped areas were selected using a multistage random sampling procedure. In the first stage, North 24 Parganas District was selected purposively. In the second stage, one developed and one underdeveloped block were selected randomly from the district. In the third stage, 10 villages were selected randomly from each block. Finally, 15 sample farmers were selected randomly from each village. The extent of adoption of technology in the different areas was calculated through various parameters.
These are the percentage area under High Yielding Variety cereals, the percentage area under High Yielding Variety pulses, the area under hybrid vegetables, the irrigated area, the mechanically operated area, the amount spent on fertilizer and pesticides, etc., in both the developed and underdeveloped areas of North 24 Parganas District, West Bengal. The percentage area under High Yielding Variety cereals was 34.86 in the developed area and 22.59 in the underdeveloped area; for High Yielding Variety pulses it was 42.07 and 31.46 percent, respectively. The area under irrigation was 57.66 and 35.71 percent, while the mechanically operated area was 10.60 and 3.13 percent, in the developed and underdeveloped areas respectively. This clearly shows that the extent of adoption of technology was significantly higher in the developed area than in the underdeveloped area. A better road network helps farmers increase their farm income, farm assets, cropping intensity, marketed surplus, and rate of adoption of new technology. With this background, this paper attempts to study the impact of agricultural infrastructure on the adoption of modern technology in agriculture in North 24 Parganas District, West Bengal.
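The sample sizes and adoption gaps reported above can be checked with simple arithmetic. The short sketch below reproduces the sample design (one block per area, 10 villages per block, 15 farmers per village) and computes the difference between the developed and underdeveloped areas for each reported indicator. The variable names are assumptions; all figures are taken from the abstract.

```python
# Sample design as described in the abstract.
areas = ["developed", "underdeveloped"]   # one block selected per area
villages_per_block = 10
farmers_per_village = 15

farmers_per_area = villages_per_block * farmers_per_village
total_farmers = farmers_per_area * len(areas)

# Reported adoption indicators in percent: (developed, underdeveloped).
adoption = {
    "HYV cereals area": (34.86, 22.59),
    "HYV pulses area": (42.07, 31.46),
    "Irrigated area": (57.66, 35.71),
    "Mechanically operated area": (10.60, 3.13),
}

# Gap in percentage points between developed and underdeveloped areas.
gaps = {
    name: round(developed - underdeveloped, 2)
    for name, (developed, underdeveloped) in adoption.items()
}
```

The design yields 150 farmers per area and 300 in total, matching the stated sample, and every indicator is higher in the developed area.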

Keywords: agricultural infrastructure, adoption of technology, farm income, road network

Procedia PDF Downloads 98
24894 Protecting Privacy and Data Security in Online Business

Authors: Bilquis Ferdousi

Abstract:

With the exponential growth of the online business, the threat to consumers’ privacy and data security has become a serious challenge. This literature review-based study focuses on a better understanding of those threats and what legislative measures have been taken to address those challenges. Research shows that people are increasingly involved in online business using different digital devices and platforms, although this practice varies based on age groups. The threat to consumers’ privacy and data security is a serious hindrance in developing trust among consumers in online businesses. There are some legislative measures taken at the federal and state level to protect consumers’ privacy and data security. The study was based on an extensive review of current literature on protecting consumers’ privacy and data security and legislative measures that have been taken.

Keywords: privacy, data security, legislation, online business

Procedia PDF Downloads 100
24893 Flowing Online Vehicle GPS Data Clustering Using a New Parallel K-Means Algorithm

Authors: Orhun Vural, Oguz Bayat, Rustu Akay, Osman N. Ucan

Abstract:

This study presents a new parallel approach to clustering GPS data. The evaluation compares the execution time of various clustering algorithms on GPS data. The paper proposes a neighborhood-based parallel K-means algorithm to make clustering faster. The proposed parallelization approach assumes that each GPS data point represents a vehicle and that vehicles close to each other communicate after the vehicles are clustered. The approach has been examined on continuously changing GPS data of different sizes and compared with the serial K-means algorithm and other serial clustering algorithms. The results demonstrate that the proposed parallel K-means algorithm works much faster than the other clustering algorithms.
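For reference, the serial K-means baseline that the parallel algorithm is compared against can be sketched as below (Lloyd's algorithm on 2-D points). This is not the neighborhood-based parallel version proposed in the paper. The points are hypothetical planar coordinates; treating raw GPS latitude and longitude as planar is a simplifying assumption.

```python
import random


def _nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean)."""
    return min(
        range(len(centroids)),
        key=lambda c: (point[0] - centroids[c][0]) ** 2
        + (point[1] - centroids[c][1]) ** 2,
    )


def kmeans(points, k, n_iter=100, seed=0):
    """Serial K-means (Lloyd's algorithm) on a list of (x, y) tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise from the data
    for _ in range(n_iter):
        # Assignment step: group points by nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[_nearest(p, centroids)].append(p)
        # Update step: move each centroid to the mean of its cluster;
        # an empty cluster keeps its previous centroid.
        new_centroids = [
            (sum(x for x, _ in cl) / len(cl), sum(y for _, y in cl) / len(cl))
            if cl else centroids[j]
            for j, cl in enumerate(clusters)
        ]
        if new_centroids == centroids:  # converged
            break
        centroids = new_centroids
    labels = [_nearest(p, centroids) for p in points]
    return centroids, labels


# Hypothetical vehicle positions forming two well-separated groups.
positions = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, labels = kmeans(positions, k=2)
```

The assignment step is independent for each point, which is the part a parallel variant would typically distribute.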

Keywords: parallel k-means algorithm, parallel clustering, clustering algorithms, clustering on flowing data

Procedia PDF Downloads 217
24892 An Analysis of Privacy and Security for Internet of Things Applications

Authors: Dhananjay Singh, M. Abdullah-Al-Wadud

Abstract:

The Internet of Things (IoT) is a concept of a large-scale ecosystem of wireless actuators. The actuators are defined as the things in the IoT, those which contribute or produce data for the ecosystem. However, ubiquitous data collection, data security, privacy preservation, large-volume data processing, and intelligent analytics are some of the key challenges in IoT technologies. To address the security requirements, challenges, and threats in the IoT, we discuss a message authentication mechanism for IoT applications. Finally, we discuss a data encryption mechanism for authenticating messages before they propagate into IoT networks.

Keywords: Internet of Things (IoT), message authentication, privacy, security

Procedia PDF Downloads 379
24891 Cognitive Science Based Scheduling in Grid Environment

Authors: N. D. Iswarya, M. A. Maluk Mohamed, N. Vijaya

Abstract:

A grid is an infrastructure that allows the deployment of large distributed data from multiple locations to reach a common goal. Scheduling data-intensive applications becomes challenging because the data sets are very large. Only two solutions exist to tackle this issue. First, a computation that requires huge data sets to be processed can be transferred to the data site. Second, the required data sets can be transferred to the computation site. In the former scenario, the computation cannot be transferred, since the servers are storage/data servers with little or no computational capability. Hence, the second scenario is considered for further exploration. During scheduling, transferring huge data sets from one site to another requires more network bandwidth. To mitigate this issue, this work focuses on incorporating cognitive science into scheduling. Cognitive science is the study of the human brain and its related activities. Current research focuses mainly on incorporating cognitive science into various computational modeling techniques. In this work, the problem-solving approach of the human brain is studied and incorporated into data-intensive scheduling in grid environments. A cognitive engine (CE) is designed and deployed at various grid sites. The intelligent agents present in the CE help in analyzing the request and creating the knowledge base. Depending on the link capacity, a decision is taken on whether to transfer the data sets or to partition them. The agents predict the next request in order to serve the requesting site with data sets in advance. This reduces the data availability time and data transfer time. The replica catalog and metadata catalog created by the agents assist in the decision-making process.

Keywords: data grid, grid workflow scheduling, cognitive artificial intelligence

Procedia PDF Downloads 391
24890 Heritage and Tourism in the Era of Big Data: Analysis of Chinese Cultural Tourism in Catalonia

Authors: Xinge Liao, Francesc Xavier Roige Ventura, Dolores Sanchez Aguilera

Abstract:

With the development of the Internet, the study of tourism behavior has rapidly expanded from the traditional physical market to the online market. Data on the Internet is characterized by dynamic changes, and new data appear all the time. In recent years, forums, blogs, and other sources have generated a large volume of data that has expanded over time and space; together they constitute large-scale Internet data, known as Big Data. This data of technological origin, which derives from the use of devices and the activity of multiple users, is becoming a source of great importance for the study of geography and the behavior of tourists. The study focuses on cultural heritage tourist practices in the context of Big Data, exploring the characteristics and behavior of Chinese tourists in relation to the cultural heritage of Catalonia. Geographical information, target image, and perceptions in user-generated content will be studied through analysis of data from Weibo, the largest microblogging social network in China. By analyzing the behavior of heritage tourists in the Big Data environment, this study seeks to understand the practices (activities, motivations, perceptions) of cultural tourists, and then their needs and preferences, in order to better guide the sustainable development of tourism in heritage sites.

Keywords: Barcelona, Big Data, Catalonia, cultural heritage, Chinese tourism market, tourists’ behavior

Procedia PDF Downloads 136
24889 Towards A Framework for Using Open Data for Accountability: A Case Study of A Program to Reduce Corruption

Authors: Darusalam, Jorish Hulstijn, Marijn Janssen

Abstract:

The media have revealed a variety of corruption cases in regional and local governments all over the world. Many governments have pursued anti-corruption reforms and created systems of checks and balances. Citizens face three types of corruption: administrative corruption, collusion and extortion. Accountability is one of the benchmarks for building transparent government. The public sector is required to report the results of the programs that have been implemented so that citizens can judge whether the institution has worked economically, efficiently and effectively. Open Data offers solutions for implementing good governance in organizations that want to be more transparent. In addition, Open Data can create transparency and accountability to the community. The objective of this paper is to build a framework for using open data for accountability to combat corruption. This paper will investigate the relationship between open data and accountability as part of anti-corruption initiatives, and the impact of open data implementation on public organizations.

Keywords: open data, accountability, anti-corruption, framework

Procedia PDF Downloads 328
24888 Syndromic Surveillance Framework Using Tweets Data Analytics

Authors: David Ming Liu, Benjamin Hirsch, Bashir Aden

Abstract:

Syndromic surveillance aims to detect or predict disease outbreaks through the analysis of medical data sources. Using social media data such as tweets for syndromic surveillance is becoming increasingly popular, aided by open platforms for collecting data and by the advantages of microblogging text and mobile geographic location features. In this paper, a Syndromic Surveillance Framework with a machine learning kernel is presented that uses tweet data analytics. Influenza and three cities of the United Arab Emirates (Abu Dhabi, Al Ain and Dubai) are used as the test disease and trial areas. Hospital case data provided by the Health Authority of Abu Dhabi (HAAD) are used for correlation purposes. In our model, a Latent Dirichlet allocation (LDA) engine is adapted to perform supervised learning classification, and N-fold cross-validation confusion matrices are given as the simulation results, with an overall system recall of 85.595% achieved.
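
The abstract does not say how the N-fold confusion matrices are combined into a single recall figure. A minimal sketch of one common approach is to pool the per-fold matrices and average the per-class recalls. The function names, the macro-averaging choice, and the counts below are illustrative assumptions, not the authors' procedure or data.

```python
def pool_confusion_matrices(fold_matrices):
    """Sum per-fold confusion matrices element-wise into one pooled matrix."""
    n_classes = len(fold_matrices[0])
    return [
        [sum(cm[i][j] for cm in fold_matrices) for j in range(n_classes)]
        for i in range(n_classes)
    ]


def recall_per_class(cm):
    """Recall of class i = cm[i][i] / sum(cm[i]); rows hold the true classes."""
    recalls = []
    for i, row in enumerate(cm):
        row_total = sum(row)
        recalls.append(row[i] / row_total if row_total else 0.0)
    return recalls


def macro_recall(cm):
    """Unweighted mean of the per-class recalls (an assumed averaging choice)."""
    recalls = recall_per_class(cm)
    return sum(recalls) / len(recalls)


# Made-up counts for two folds and two classes (influenza-related, other).
fold_1 = [[40, 10], [5, 45]]
fold_2 = [[45, 5], [10, 40]]
pooled = pool_confusion_matrices([fold_1, fold_2])
print(pooled)
print(recall_per_class(pooled))
print(macro_recall(pooled))
```

Pooling before computing recall weights every sample equally; averaging the per-fold recalls instead would be an equally defensible choice and can give a slightly different number.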

Keywords: Syndromic surveillance, Tweets, Machine Learning, data mining, Latent Dirichlet allocation (LDA), Influenza

Procedia PDF Downloads 108
24887 Liver Tumor Detection by Classification through FD Enhancement of CT Image

Authors: N. Ghatwary, A. Ahmed, H. Jalab

Abstract:

In this paper, an approach for liver tumor detection in computed tomography (CT) images is presented. The detection process is based on classifying the features of target liver cells as either tumor or non-tumor. Fractional differential (FD) is applied to enhance liver CT images, with the aim of enhancing texture and edge features. A fusion method is then applied to merge the various enhanced images and produce a variety of feature improvements, which increases the accuracy of classification. Each image is divided into NxN non-overlapping blocks to extract the desired features. A support vector machine (SVM) classifier is then trained on a supplied dataset different from the tested one. Finally, each block cell is identified as tumor or non-tumor. Our approach is validated on a group of patients' CT liver tumor datasets. The experimental results demonstrate the efficiency of detection in the proposed technique.
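
The block-division step can be sketched as follows. This is only an illustration under stated assumptions: the image is a plain 2-D list of intensities, rows and columns that do not fill a complete block are dropped, and the block mean stands in for the texture features the paper extracts (which the abstract does not specify). The function names are hypothetical.

```python
def split_into_blocks(image, n):
    """Split a 2-D list into n x n non-overlapping blocks in row-major order.

    Trailing rows or columns that do not fill a complete block are dropped.
    """
    rows, cols = len(image), len(image[0])
    blocks = []
    for r in range(0, rows - n + 1, n):
        for c in range(0, cols - n + 1, n):
            blocks.append([row[c:c + n] for row in image[r:r + n]])
    return blocks


def block_mean(block):
    """Mean intensity of a block; a placeholder for a real texture feature."""
    values = [v for row in block for v in row]
    return sum(values) / len(values)


# Tiny made-up 4 x 4 "image" split into 2 x 2 blocks.
image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
blocks = split_into_blocks(image, 2)
features = [block_mean(b) for b in blocks]
print(features)
```

Each per-block feature vector would then be passed to the classifier, so the block size N controls the spatial resolution of the tumor/non-tumor map.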

Keywords: fractional differential (FD), computed tomography (CT), fusion, alpha, texture features

Procedia PDF Downloads 354
24886 Biostimulant and Abiotic Plant Stress Interactions in Malting Barley: A Glasshouse Study

Authors: Conor Blunt, Mariluz del Pino-de Elias, Grace Cott, Saoirse Tracy, Rainer Melzer

Abstract:

The European Green Deal announced in 2021 calls for agricultural chemical pesticide use and synthetic fertilizer application to be reduced by 50% and 20%, respectively, by 2030. Increasing and maintaining expected yields under these ambitious goals has strained the agricultural sector. This intergovernmental plan has identified plant biostimulants as one potential input to facilitate this new phase of sustainable agriculture; these products are defined as microorganisms or substances that can stimulate soil and plant functioning to enhance crop nutrient use efficiency, quality and tolerance to abiotic stresses. Spring barley is Ireland's most widely sown tillage crop, and grain destined for malting commands the highest market price. Heavy, erratic rainfall is forecast for Ireland's future climate, and barley is particularly susceptible to waterlogging. Recent findings suggest that plant receptivity to biostimulants may depend on the level of stress inflicted on crops to elicit an assisted plant response. In this study, three biostimulants of different origin (seaweed, protein hydrolysate and bacteria) are applied to 'RGT Planet' malting barley fertilized at three different rates (0 kg/ha, 40 kg/ha, 75 kg/ha) of calcium ammonium nitrate (27% N) under non-stressed and waterlogged conditions. This 4x3x2 factorial trial was planted in a completely randomized block design with one plant per experimental unit. Leaf gas exchange data and key agronomic and grain quality parameters were analyzed via ANOVA. No productivity penalty was evident in plants receiving 40 kg/ha of N plus biostimulant compared with the 75 kg/ha N treatments. The main effects of nitrogen application and waterlogging accounted for the most significant variation in the dataset.
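
To make the 4x3x2 layout concrete, the sketch below enumerates the treatment combinations and shuffles them within one block. It assumes that the fourth biostimulant level is an untreated control, which the abstract implies but does not state; the variable names, the helper function, and the seed are illustrative only.

```python
import itertools
import random

# Factor levels; "control" is an assumed fourth biostimulant level.
biostimulants = ["control", "seaweed", "protein hydrolysate", "bacteria"]
nitrogen_rates_kg_ha = [0, 40, 75]
water_regimes = ["non-stressed", "waterlogged"]

# Full factorial: every combination of the three factors.
treatments = list(
    itertools.product(biostimulants, nitrogen_rates_kg_ha, water_regimes)
)


def randomize_block(treatment_list, seed):
    """Return the treatments in a random order for one block (one plant each)."""
    rng = random.Random(seed)
    order = list(treatment_list)
    rng.shuffle(order)
    return order


block = randomize_block(treatments, seed=1)
print(len(treatments))
```

The full factorial gives 4 x 3 x 2 = 24 treatment combinations, so each block holds 24 plants when there is one plant per experimental unit.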

Keywords: biostimulant, Barley, malting, NUE, waterlogging

Procedia PDF Downloads 72
24885 Analysis of Urban Population Using Twitter Distribution Data: Case Study of Makassar City, Indonesia

Authors: Yuyun Wabula, B. J. Dewancker

Abstract:

In the past decade, social networking apps have grown very rapidly. Geolocation data is one of the important features of social media, as it attaches the user's real-world location coordinates. This paper proposes the use of geolocation data from the Twitter social media application to gain knowledge about urban dynamics, especially human mobility behavior. It aims to explore the relation between Twitter geolocation and the presence of people in the urban area. First, the study analyzes the spread of people in a particular area within the city using Twitter social media data. Second, we match and categorize the existing places based on the same individuals visiting them. Then, we combine the Twitter data from the tracking result with the questionnaire data to capture the Twitter user profile. To do that, we use distribution frequency analysis to learn the visitors' percentages. To validate the hypothesis, we compare the result with the local population statistics and land use mapping released by the city planning department of the Makassar local government. The results show that there is a correlation between Twitter geolocation and questionnaire data. Thus, integrating the Twitter data and survey data can reveal the profile of social media users.

Keywords: geolocation, Twitter, distribution analysis, human mobility

Procedia PDF Downloads 312
24884 Analysis and Rule Extraction of Coronary Artery Disease Data Using Data Mining

Authors: Rezaei Hachesu Peyman, Oliyaee Azadeh, Salahzadeh Zahra, Alizadeh Somayyeh, Safaei Naser

Abstract:

Coronary Artery Disease (CAD) is a major cause of disability in adults and a main cause of death in developed countries. In this study, data mining techniques including Decision Trees, Artificial Neural Networks (ANNs), and Support Vector Machines (SVM) are used to analyze CAD data. Data from 4948 patients who had suffered from heart disease were included in the analysis. CAD is the target variable, and 24 input or predictor variables are used for the classification. The performance of these techniques is compared in terms of sensitivity, specificity, and accuracy. The most significant factor influencing CAD is chest pain. Elderly males (age > 53) have a high probability of being diagnosed with CAD. The SVM algorithm is the most useful for evaluating and predicting CAD patients as compared to non-CAD ones. Applying data mining techniques to the analysis of coronary artery disease is a good method for investigating the existing relationships between variables.
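
The three comparison metrics follow directly from the counts of a binary (CAD / non-CAD) confusion matrix. The sketch below uses the standard definitions; the function name and the counts are made up for illustration and are not results from the study.

```python
def binary_metrics(tp, fn, fp, tn):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts.

    tp: CAD patients predicted as CAD        fn: CAD patients predicted as non-CAD
    fp: non-CAD patients predicted as CAD    tn: non-CAD patients predicted as non-CAD
    """
    sensitivity = tp / (tp + fn)   # share of true CAD cases that were detected
    specificity = tn / (tn + fp)   # share of non-CAD cases correctly ruled out
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    return sensitivity, specificity, accuracy


# Hypothetical counts for one classifier on a held-out set of 200 patients.
sensitivity, specificity, accuracy = binary_metrics(tp=80, fn=20, fp=10, tn=90)
print(sensitivity, specificity, accuracy)
```

Reporting all three matters because accuracy alone can look high on an imbalanced sample even when few CAD patients are detected.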

Keywords: classification, coronary artery disease, data-mining, knowledge discovery, extract

Procedia PDF Downloads 655
24883 Sensor Data Analysis for a Large Mining Major

Authors: Sudipto Shanker Dasgupta

Abstract:

One of the largest mining companies wanted to look at health analytics for its driverless trucks. These trucks were key to its supply chain logistics. The automated trucks had multi-level sub-assemblies that sent out sensor information. The use case was to capture the sensor signals from the truck subcomponents and analyze the health of the trucks from a repair and replacement perspective. Open source software was used to stream the data into a clustered Hadoop setup in the Amazon Web Services cloud, and Apache Spark SQL was used to analyze the data. All of this was achieved on a 10-node Amazon setup (32 cores, 64 GB RAM), on which real-time analytics was achieved on 300 million records. To check the scalability of the system, the cluster was increased to a 100-node setup. This talk will highlight how open source software was used to achieve the above use case and share insights on high data throughput in a cloud setup.

Keywords: streaming analytics, data science, big data, Hadoop, high throughput, sensor data

Procedia PDF Downloads 400
24882 Financial Liberalization, Exchange Rates and Demand for Money in Developing Economies: The Case of Nigeria, Ghana and Gambia

Authors: John Adebayo Oloyhede

Abstract:

This paper examines the effect of financial liberalization on the stability of the demand for money function and its implication for the exchange rate behaviour of three African countries. As the demand for money function is regarded as one of the two main building blocks of most exchange rate determination models, the other being purchasing power parity, its stability is required for the monetary models of exchange rate determination to hold. To what extent has the liberalisation policy of these countries, for instance liberalised interest rates, affected the demand for money function, and what has been the consequence for the validity and relevance of floating exchange rate models? The study adopts the Autoregressive Instrumental Package (AIV) of multiple regression technique and follows the Almon Polynomial procedure with zero-end constraint. Data for the period 1986 to 2011 were drawn from three developing countries of Africa, namely Gambia, Ghana and Nigeria, which not only started the liberalization and floating system at almost the same period but also share similar and diverse economic and financial structures. The findings show that the demand for money was a stable function of income and interest rate at home and abroad. Other factors such as exchange rate and foreign interest rate exerted some significant effect on domestic money demand. The short-run and long-run elasticity with respect to income, interest rates, expected inflation rate and exchange rate expectation are not greater than zero. This evidence conforms to some extent to the expected behaviour of the domestic money function and underscores its ability to serve as a good building block or assumption of the monetary model of exchange rate determination. This will, therefore, assist appropriate monetary authorities in the design and implementation of further financial liberalization policy packages in developing countries.
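
The abstract does not give the lag length or polynomial degree used in the Almon procedure. As a hedged illustration only, the sketch below assumes a second-degree polynomial with both end points constrained to zero (the weight is zero one period before lag 0 and one period after the last lag), in which case the lag weights reduce to a single scale parameter times a fixed parabola. The function names and numbers are hypothetical.

```python
def almon_weights(scale, max_lag):
    """Lag weights beta_i = scale * (i + 1) * (max_lag + 1 - i), i = 0..max_lag.

    This is a degree-2 Almon polynomial with beta_(-1) = 0 and
    beta_(max_lag + 1) = 0, so only `scale` remains to be estimated.
    """
    return [scale * (i + 1) * (max_lag + 1 - i) for i in range(max_lag + 1)]


def distributed_lag_effect(x_history, weights):
    """Weighted sum of current and lagged values.

    x_history[0] is the current value, x_history[i] the value i periods ago.
    """
    return sum(w * x for w, x in zip(weights, x_history))


weights = almon_weights(scale=1.0, max_lag=3)
print(weights)
print(distributed_lag_effect([1.0, 1.0, 1.0, 1.0], weights))
```

In an actual estimation, the scale parameter would be fitted by regressing money demand on the single constructed variable (the weighted sum with scale 1), which is what makes the constrained polynomial lag economical in degrees of freedom.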

Keywords: financial liberalisation, exchange rates, demand for money, developing economies

Procedia PDF Downloads 367