Search results for: clustering on flowing data
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 8119

7909 A Parallel Computation Based on GPU Programming for a 3D Compressible Fluid Flow Simulation

Authors: Sugeng Rianto, P.W. Arinto Yudi, Soemarno Muhammad Nurhuda

Abstract:

Computing a 3D compressible fluid flow for a virtual environment with haptic interaction is a non-trivial problem, especially when it comes to achieving good performance and balancing visualization, tactile feedback interaction, and computation. In this paper, we describe our computation approach based on parallel programming on a GPU. The 3D fluid flow solvers were developed for smoke dispersion simulation by combining cubic interpolated propagation (CIP) based fluid flow solvers with the parallelism and programmability of the GPU. The fluid flow solver is built on a GPU-CPU message passing scheme to allow rapid development of haptic feedback modes for fluid dynamic data. Applying the CIP solvers yields a rapid solution, and with this scheme multiphase fluid flow equations can be solved simultaneously. To further accelerate the computation, the Navier-Stokes equations (NSEs) are packed into texel channels, and the computation models are performed on pixels, which can be regarded as a grid of cells. Therefore, despite the complexity of the obstacle geometry, multiple vertices and pixels can be processed simultaneously in parallel. The data are also shared in global memory so that the CPU can control the haptic device, providing kinaesthetic interaction and feeling. The results show that GPU-based parallel computation provides effective simulation of a compressible fluid flow model for real-time interaction in 3D computer graphics on a PC platform. This report has shown the feasibility of a new approach to solving the compressible fluid flow equations on the GPU. The experimental tests showed that compressible fluid flowing around various obstacles, with haptic interaction on a few model obstacles, can be simulated effectively and efficiently at a reasonable frame rate with realistic visualization. These results confirm that good performance and a balance between visualization, tactile feedback interaction, and computation can be achieved.

Keywords: CIP, compressible fluid, GPU programming, parallel computation, real-time visualisation

Procedia PDF Downloads 415
7908 3D Steady and Transient Centrifugal Pump Flow within Ansys CFX and OpenFOAM

Authors: Clement Leroy, Guillaume Boitel

Abstract:

This paper presents a comparative benchmarking review of steady and transient three-dimensional (3D) flow computations in a centrifugal pump using commercial (Ansys CFX) and open-source (OpenFOAM) computational fluid dynamics (CFD) software. In a centrifugal rotor-dynamic pump, the fluid enters the impeller along the rotating axis and is accelerated in order to increase the pressure, flowing radially outward into another stage, a vaned diffuser, or a volute casing, from where it finally exits into a downstream pipe. Simulations are carried out at the best efficiency point (BEP) and at part load, for single-phase flow with several turbulence models. The results are compared with overall performance data from experiments. The use of CFD technology in industry is still limited by high computational costs, and even more by the high cost of commercial CFD software and high-performance computing (HPC) licenses. The main objective of the present study is to define an OpenFOAM methodology for high-quality 3D steady and transient turbomachinery CFD simulation in order to conduct a thorough time-accurate performance analysis. In addition, a detailed comparison between the computational methods and features of the latest Ansys release 18 and OpenFOAM is carried out to assess the accuracy and industrial applicability of these solvers. Finally, an automated connected workflow (IoT) for turbine blade applications is presented.

Keywords: benchmarking, CFX, internet of things, openFOAM, time-accurate, turbomachinery

Procedia PDF Downloads 189
7907 Bioeconomic Modelling for Barramundi (Lates calcarifer) in Queensland: Implications for Recreational Fishing Following Recent Gill Netting Closures

Authors: Sabiha S. Marine, Nicole Flint, John Rolfe

Abstract:

The Queensland state government introduced commercial gill net fishing closures in Cairns, Mackay, and Rockhampton in November 2015 to increase the recreational fishing opportunities, nature-based tourism, and economic benefits in these three regional areas. This management change is likely to improve the potential for more desirable stock structures through natural recruitment. Barramundi (Lates calcarifer) is one of the popular target fish for recreational and commercial fishers in Northern Australia. This investigation examines the effects of reduced commercial fishing from both biological and economic perspectives, particularly on the local Barramundi population of the Fitzroy River in Rockhampton, the largest river catchment flowing to the eastern coast of Australia. Data on different parameters of biological and economic aspects have been collated from secondary sources for analysis through a system simulation approach to identify the effectiveness of the commercial netting closures on recreational fishing effort, especially for the Barramundi population. The results have the potential to explain certain consequences of the netting closures in Queensland, which could serve to inform future fisheries management decisions. The study output as a whole will help in the better management of fisheries resources by evaluating recreational fishing opportunities in Queensland, where the potential for increases in recreation is high.

Keywords: Barramundi, bioeconomic model, fishery management, recreational fishing

Procedia PDF Downloads 152
7906 Impact of Climate Change and Anthropogenic Effect on Hilsa Fishery Management in South-East Asia: Urgent Need for Trans-Boundary Policy

Authors: Dewan Ali Ahsan

Abstract:

Hilsa (Tenualosa ilisha) is one of the most important anadromous fish species of the trans-boundary ecosystem of Bangladesh, India and Myanmar. Hilsa is not only an economically important species, especially for Bangladesh and India, but also an integral part of the culture of both countries. In Bangladesh, this flagship species alone contributes 10.82% of the country’s total fish production, and about 75% of the world’s total hilsa catch comes from Bangladesh. As an anadromous fish, hilsa migrates from the Bay of Bengal to rivers for spawning, nursing and growing, and for all of these purposes it needs fresh water. Ripe broods prefer turbid, fast-flowing fresh water for spawning, whereas the young prefer clear, slow-flowing fresh water. Climate change (salinity intrusion, sea level rise, temperature rise, impact on freshwater flow), unplanned development and other anthropogenic activities are together severely damaging the hilsa stock and its habitats. Climate change and human interference are therefore predicted to have a range of direct and indirect impacts on the marine and freshwater hilsa fishery, with implications for fisheries-dependent economies, coastal communities and fisherfolk. The present study identified salinity intrusion, siltation of river beds, decreased water flow from upstream, fragmentation of rivers in the dry season, over-exploitation and the use of small-mesh nets as the major factors affecting the upstream migration of hilsa and its sustainable management. The Bangladesh government has taken some actions for hilsa management: it is trying to increase hilsa production not only by conserving jatka (juvenile hilsa) but also by protecting brood hilsa during the breeding season through seasonal fishing bans, mesh size restrictions, etc. Unfortunately, no such management plans are available for Indian and Myanmar territory. As hilsa is a highly migratory trans-boundary fish in the Bay of Bengal, and all of these countries share the same stock, it is essential that Bangladesh, India and Myanmar adopt a joint policy for the sustainable management of the hilsa stock.

Keywords: hilsa, climate change, south-east Asia, fishery management

Procedia PDF Downloads 488
7905 Model Order Reduction for Frequency Response and Effect of Order of Method for Matching Condition

Authors: Aref Ghafouri, Mohammad javad Mollakazemi, Farhad Asadi

Abstract:

In this paper, a model order reduction method is used to approximate experimental data exhibiting linear and nonlinear behaviour. The method can be used to obtain an offline reduced model that approximates and follows the experimental data and the order of the system, and it can match the experimental data at some frequency ratios. In this study, the method is compared across different experimental data sets, and the influence of the chosen reduction order on obtaining the best and sufficient matching condition for following the data is investigated in terms of the real and imaginary parts of the frequency response curve. Finally, the effect and importance of the reduction order for nonlinear experimental data are explained further.

Keywords: frequency response, order of model reduction, frequency matching condition, nonlinear experimental data

Procedia PDF Downloads 381
7904 An Empirical Study of the Impacts of Big Data on Firm Performance

Authors: Thuan Nguyen

Abstract:

At present, data is to a data-driven, knowledge-based economy what oil was to the industrial age hundreds of years ago. Data is everywhere in vast volumes! Big data analytics is expected to help firms not only improve performance efficiently but also completely transform how they run their business. However, employing the emergent technology successfully is not easy, and assessing the role of big data in improving firm performance is even harder. There has been a lack of studies examining the impact of big data analytics on organizational performance. This study aimed to fill the gap. The present study suggested using firms’ intellectual capital as a proxy for big data in evaluating its impact on organizational performance. The present study employed the Value Added Intellectual Coefficient method to measure firm intellectual capital via its three main components (human capital efficiency, structural capital efficiency, and capital employed efficiency) and then used the structural equation modeling technique to model the data and test the models. The financial fundamental and market data of 100 randomly selected publicly listed firms were collected. The results of the tests showed that only human capital efficiency had a significant positive impact on firm profitability, which highlighted the prominent human role in the impact of big data technology.
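The three components named in the abstract can be illustrated with a short calculation. The sketch below assumes the commonly cited formulation of the Value Added Intellectual Coefficient (structural capital taken as value added minus human capital); the abstract does not give the formulas, so the definitions, the function name `vaic`, and the input figures are illustrative assumptions rather than the paper's own computation.

```python
def vaic(value_added, human_capital, capital_employed):
    """Compute VAIC components under the commonly cited formulation (assumed here).

    value_added      -- value added by the firm (VA)
    human_capital    -- total employee expenditure (HC)
    capital_employed -- book value of net assets (CE)
    """
    structural_capital = value_added - human_capital  # SC = VA - HC (assumption)
    hce = value_added / human_capital                 # human capital efficiency
    sce = structural_capital / value_added            # structural capital efficiency
    cee = value_added / capital_employed              # capital employed efficiency
    return {"HCE": hce, "SCE": sce, "CEE": cee, "VAIC": hce + sce + cee}


if __name__ == "__main__":
    # Hypothetical figures, in arbitrary currency units.
    print(vaic(value_added=100.0, human_capital=40.0, capital_employed=200.0))
```

The three efficiencies would then enter a structural equation model as separate predictors, which is how a result such as "only human capital efficiency is significant" can arise.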

Keywords: big data, big data analytics, intellectual capital, organizational performance, value added intellectual coefficient

Procedia PDF Downloads 223
7903 Automated Test Data Generation For some types of Algorithm

Authors: Hitesh Tahbildar

Abstract:

The computational cost of generating test data for a program is very high. In the general case, no algorithm that generates test data for all types of algorithms has been found. The cost of generating test data differs between types of algorithm. To date, the emphasis has been on generating test data for different types of programming constructs rather than different types of algorithms. We implemented test data generation methods to find heuristics for different types of algorithms. Algorithms based on divide and conquer, backtracking, the greedy approach, and dynamic programming were tested to find the minimum cost of test data generation. Our experimental results indicate that some of these algorithm types can serve as a necessary condition for selecting heuristics, while programming constructs are a sufficient condition. Finally, we recommend heuristics for test data generation to be selected for different types of algorithms.

Keywords: longest path, saturation point, lmax, kL, kS

Procedia PDF Downloads 387
7902 The Perspective on Data Collection Instruments for Younger Learners

Authors: Hatice Kübra Koç

Abstract:

In academia, collecting reliable and valid data is one of the most significant issues for researchers. However, the procedure is not the same for all target groups: when collecting data from teenagers, young adults, or adults, researchers can use common data collection tools such as questionnaires, interviews, and semi-structured interviews, yet for young and very young learners such reliable and valid tools cannot easily be designed or applied. In this study, firstly, common data collection tools are examined for the ‘very young’ and ‘young learners’ participant groups, since the quality and efficiency of an academic study are thought to depend mainly on valid and correct data collection and data analysis procedures. Secondly, two different data collection instruments for very young and young learners are described and their efficacy is discussed. Finally, a suggested data collection tool, a performance-based questionnaire developed specifically for the ‘very young’ and ‘young learners’ participant groups in the field of teaching English to young learners as a foreign language, is presented. The design procedure and the suggested items/factors for this tool are set out at the end of the study to help researchers who work with young and very young learners.

Keywords: data collection instruments, performance-based questionnaire, young learners, very young learners

Procedia PDF Downloads 68
7901 Generating Swarm Satellite Data Using Long Short-Term Memory and Generative Adversarial Networks for the Detection of Seismic Precursors

Authors: Yaxin Bi

Abstract:

Accurate prediction and understanding of the evolution mechanisms of earthquakes remain challenging in the fields of geology, geophysics, and seismology. This study leverages Long Short-Term Memory (LSTM) networks and Generative Adversarial Networks (GANs), a generative model tailored to time-series data, for generating synthetic time series data based on Swarm satellite data, which will be used for detecting seismic anomalies. LSTMs demonstrated commendable predictive performance in generating synthetic data across multiple countries. In contrast, the GAN models struggled to generate synthetic data, often producing non-informative values, although they were able to capture the data distribution of the time series. These findings highlight both the promise and challenges associated with applying deep learning techniques to generate synthetic data, underscoring the potential of deep learning in generating synthetic electromagnetic satellite data.

Keywords: LSTM, GAN, earthquake, synthetic data, generative AI, seismic precursors

Procedia PDF Downloads 12
7900 Generation of Quasi-Measurement Data for On-Line Process Data Analysis

Authors: Hyun-Woo Cho

Abstract:

To ensure the safety of a manufacturing process, one should quickly identify the assignable cause of a fault on an on-line basis. To this end, many statistical techniques, including linear and nonlinear methods, have frequently been used. However, such methods suffer from a major problem of small sample size, which is mostly attributed to the characteristics of the empirical models used as reference models. This work presents a new method to overcome the insufficiency of measurement data in monitoring and diagnosis tasks. Quasi-measurement data are generated from existing data based on two indices, similarity and importance. The performance of the method is demonstrated using a real data set. The results show that the presented method is able to handle the insufficiency problem successfully. In addition, it is shown to be quite efficient in terms of computational speed and memory usage, and thus on-line implementation of the method is straightforward for monitoring and diagnosis purposes.

Keywords: data analysis, diagnosis, monitoring, process data, quality control

Procedia PDF Downloads 460
7899 Formulation of Optimal Shifting Sequence for Multi-Speed Automatic Transmission

Authors: Sireesha Tamada, Debraj Bhattacharjee, Pranab K. Dan, Prabha Bhola

Abstract:

The most important component in an automotive transmission system is the gearbox which controls the speed of the vehicle. In an automatic transmission, the right positioning of actuators ensures efficient transmission mechanism embodiment, wherein the challenge lies in formulating the number of actuators associated with modelling a gearbox. Data with respect to actuation and gear shifting sequence has been retrieved from the available literature, including patent documents, and has been used in this proposed heuristics based methodology for modelling actuation sequence in a gear box. This paper presents a methodological approach in designing a gearbox for the purpose of obtaining an optimal shifting sequence. The computational model considers factors namely, the number of stages and gear teeth as input parameters since these two are the determinants of the gear ratios in an epicyclic gear train. The proposed transmission schematic or stick diagram aids in developing the gearbox layout design. The number of iterations and development time required to design a gearbox layout is reduced by using this approach.

Keywords: automatic transmission, gear-shifting, multi-stage planetary gearbox, rank ordered clustering

Procedia PDF Downloads 305
7898 Vibration Analysis of Pendulum in a Viscous Fluid by Analytical Methods

Authors: Arash Jafari, Mehdi Taghaddosi, Azin Parvin

Abstract:

In this study, the differential equation governing the vibration of a swinging single-degree-of-freedom pendulum in a viscous fluid is investigated. The damping process is characterized in two different regimes: first, damping in a stationary viscous fluid; second, damping in a viscous fluid flowing with constant velocity. Our purpose is to improve the ability to solve this nonlinear differential equation with a simple and innovative approach. Comparisons are made between the new method and a numerical method (rkf45). The results show that the method is very effective and simple and can be applied to other nonlinear problems.
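A numerical reference solution of the kind used for comparison can be sketched as follows. The abstract does not state the governing equation, so this example assumes the textbook form θ'' + c·θ' + ω₀²·sin θ = 0 for the stationary-fluid regime, with linear viscous damping. It also swaps in a fixed-step classical fourth-order Runge-Kutta integrator in place of the adaptive rkf45 scheme the abstract mentions. The function name `simulate_pendulum` and all parameter values are illustrative.

```python
import math


def simulate_pendulum(theta0, omega0_sq, damping, t_end, dt=1e-3):
    """Integrate theta'' + damping*theta' + omega0_sq*sin(theta) = 0 with RK4.

    Starts from rest at angle theta0 and returns a list of (t, theta, velocity).
    The equation form is an assumption; it is not taken from the paper.
    """
    def deriv(theta, vel):
        # First-order system: d(theta)/dt = vel, d(vel)/dt = acceleration.
        return vel, -damping * vel - omega0_sq * math.sin(theta)

    theta, vel = theta0, 0.0
    steps = int(round(t_end / dt))
    history = [(0.0, theta, vel)]
    for n in range(steps):
        k1t, k1v = deriv(theta, vel)
        k2t, k2v = deriv(theta + 0.5 * dt * k1t, vel + 0.5 * dt * k1v)
        k3t, k3v = deriv(theta + 0.5 * dt * k2t, vel + 0.5 * dt * k2v)
        k4t, k4v = deriv(theta + dt * k3t, vel + dt * k3v)
        theta += dt * (k1t + 2 * k2t + 2 * k3t + k4t) / 6
        vel += dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
        history.append(((n + 1) * dt, theta, vel))
    return history


def energy(theta, vel, omega0_sq):
    """Mechanical energy per unit moment of inertia."""
    return 0.5 * vel ** 2 + omega0_sq * (1.0 - math.cos(theta))


if __name__ == "__main__":
    # Hypothetical parameters: omega0_sq = g/L with L = 1 m, damping in 1/s.
    hist = simulate_pendulum(theta0=0.5, omega0_sq=9.81, damping=0.5, t_end=10.0)
    t, theta, vel = hist[-1]
    print(f"t = {t:.1f} s, theta = {theta:.4f} rad")
```

An analytical approximation can be checked against such a trajectory by comparing the decay envelope and the oscillation frequency over time.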

Keywords: oscillating systems, angular frequency and damping ratio, pendulum at fluid, locus of maximum

Procedia PDF Downloads 324
7897 Bag of Local Features for Person Re-Identification on Large-Scale Datasets

Authors: Yixiu Liu, Yunzhou Zhang, Jianning Chi, Hao Chu, Rui Zheng, Libo Sun, Guanghao Chen, Fangtong Zhou

Abstract:

In the last few years, large-scale person re-identification has attracted a lot of attention in video surveillance since it has potential applications in public safety management. However, it is still a challenging task considering the variation in human pose, the changing illumination conditions and the lack of paired samples. Although accuracy has been significantly improved, the training remains heavily dependent on sample data. To tackle this problem, a new strategy for designing the feature representation is proposed based on the bag of visual words (BoVW) model, which has been widely used in the field of image retrieval. The local features are extracted, and a more discriminative feature representation is obtained by cross-view dictionary learning (CDL); the assignment map is then obtained through k-means clustering. Finally, BoVW histograms are formed, which encode the images with the statistics of the feature classes in the assignment map. Experiments conducted on the CUHK03, Market1501 and MARS datasets show that the proposed method performs favorably against existing approaches.
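The clustering and histogram steps of a bag-of-visual-words pipeline can be sketched in a few lines. This is a minimal illustration, not the authors' implementation. It omits local feature extraction and the cross-view dictionary learning step entirely, uses plain Lloyd-style k-means on toy 2D descriptors, and all function names are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centers):
    """Index of the center (visual word) closest to the point."""
    return min(range(len(centers)), key=lambda j: sq_dist(point, centers[j]))


def kmeans(features, k, iters=20, seed=0):
    """Plain k-means: returns k cluster centers (the visual vocabulary)."""
    rng = random.Random(seed)
    centers = [list(f) for f in rng.sample(features, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for f in features:
            groups[nearest(f, centers)].append(f)
        for j, members in enumerate(groups):
            if members:  # an empty cluster keeps its previous center
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return centers


def bovw_histogram(descriptors, centers):
    """Normalized count of descriptors assigned to each visual word."""
    counts = [0] * len(centers)
    for d in descriptors:
        counts[nearest(d, centers)] += 1
    total = sum(counts)
    return [c / total for c in counts]


if __name__ == "__main__":
    rng = random.Random(1)
    # Toy descriptors drawn around two well-separated points.
    group_a = [[rng.gauss(0, 0.1), rng.gauss(0, 0.1)] for _ in range(50)]
    group_b = [[rng.gauss(5, 0.1), rng.gauss(5, 0.1)] for _ in range(50)]
    vocabulary = kmeans(group_a + group_b, k=2)
    print(bovw_histogram(group_a, vocabulary))
```

Each image is then represented by one fixed-length histogram regardless of how many local descriptors it produced, which is what makes the representation convenient for matching and retrieval.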

Keywords: bag of visual words, cross-view dictionary learning, person re-identification, reranking

Procedia PDF Downloads 174
7896 Ethics Can Enable Open Source Data Research

Authors: Dragana Calic

Abstract:

The openness, availability and the sheer volume of big data have provided, what some regard as, an invaluable and rich dataset. Researchers, businesses, advertising agencies, medical institutions, to name only a few, collect, share, and analyze this data to enable their processes and decision making. However, there are important ethical considerations associated with the use of big data. The rapidly evolving nature of online technologies has overtaken the many legislative, privacy, and ethical frameworks and principles that exist. For example, should we obtain consent to use people’s online data, and under what circumstances can privacy considerations be overridden? Current guidance on how to appropriately and ethically handle big data is inconsistent. Consequently, this paper focuses on two quite distinct but related ethical considerations that are at the core of the use of big data for research purposes. They include empowering the producers of data and empowering researchers who want to study big data. The first consideration focuses on informed consent which is at the core of empowering producers of data. In this paper, we discuss some of the complexities associated with informed consent and consider studies of producers’ perceptions to inform research ethics guidelines and practice. The second consideration focuses on the researcher. Similarly, we explore studies that focus on researchers’ perceptions and experiences.

Keywords: big data, ethics, producers’ perceptions, researchers’ perceptions

Procedia PDF Downloads 271
7895 Bubble Growth in a Two Phase Upward Flow in a Miniature Tube

Authors: R. S. Hassani, S. Chikh, L. Tadrist, S. Radev

Abstract:

A bubbly flow in a vertical miniature tube is analyzed theoretically. The liquid and gas phases flow co-currently upward. The gas phase is injected via a nozzle with an inner diameter of 0.11 mm placed on the axis of the tube. A force balance is applied to the bubble at its detachment. The set of governing equations is solved using Mathematica software. The bubble diameter and the bubble generation frequency are determined for various inlet phase velocities represented by the inlet mass quality. The results show different behavior of bubble growth and detachment depending on the tube size.

Keywords: two phase flow, bubble growth, mini-channel, generation frequency

Procedia PDF Downloads 421
7894 Hybrid Reliability-Similarity-Based Approach for Supervised Machine Learning

Authors: Walid Cherif

Abstract:

Data mining has seen big advances in recent years because of the spread of the internet, which generates a tremendous volume of data every day, and because of the immense advances in technologies that facilitate the analysis of these data. In particular, classification techniques are a subdomain of data mining that determine to which group each data instance in a given dataset belongs; they are used to classify data into different classes according to desired criteria. Generally, a classification technique is either statistical or based on machine learning, and each type has its own limits. Data are becoming increasingly heterogeneous; consequently, current classification techniques are encountering many difficulties. This paper defines new measure functions to quantify the resemblance between instances and then combines them in a new approach that differs from existing algorithms in its reliability computations. The proposed approach outperformed most common classification techniques, with an f-measure exceeding 97% on the IRIS dataset.
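For readers unfamiliar with the reported metric, the f-measure is the harmonic mean of precision and recall. The sketch below computes it per class and averages over classes (macro averaging). The abstract does not say which averaging the authors used, so that choice and the function names are assumptions, and the labels are made up for illustration.

```python
def f_measure(y_true, y_pred, positive):
    """F-measure (harmonic mean of precision and recall) for one class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f_measure(y_true, y_pred):
    """Unweighted mean of the per-class f-measures (macro averaging, assumed)."""
    classes = sorted(set(y_true))
    return sum(f_measure(y_true, y_pred, c) for c in classes) / len(classes)


if __name__ == "__main__":
    # Hypothetical labels for a two-class problem.
    truth = ["a", "a", "b", "b"]
    predicted = ["a", "b", "b", "b"]
    print(round(macro_f_measure(truth, predicted), 4))
```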

Keywords: data mining, knowledge discovery, machine learning, similarity measurement, supervised classification

Procedia PDF Downloads 449
7893 PEINS: A Generic Compression Scheme Using Probabilistic Encoding and Irrational Number Storage

Authors: P. Jayashree, S. Rajkumar

Abstract:

With social networks and smart devices generating a multitude of data, effective data management is the need of the hour for networks and cloud applications. Some applications need effective storage while others need effective communication over networks, and data reduction is a handy solution that meets both requirements. Most data compression techniques are based on data statistics and may result in either lossy or lossless data reduction. Though lossy reduction produces better compression ratios than lossless methods, many applications require data accuracy and fine details to be preserved. A variety of data compression algorithms exist in the literature for different forms of data such as text, image, and multimedia data. In the proposed work, a generic progressive compression algorithm based on probabilistic encoding, called PEINS, is proposed as an enhancement over the irrational number storage coding technique to address the storage issues of increasing data volumes as a cost-effective solution, which also offers data security to some extent as a secondary outcome. The proposed work demonstrates cost effectiveness in terms of a better compression ratio with no deterioration in compression time.

Keywords: compression ratio, generic compression, irrational number storage, probabilistic encoding

Procedia PDF Downloads 271
7892 Comparison of Selected Pier-Scour Equations for Wide Piers Using Field Data

Authors: Nordila Ahmad, Thamer Mohammad, Bruce W. Melville, Zuliziana Suif

Abstract:

Current methods for predicting local scour at wide bridge piers were developed on the basis of laboratory studies, and very few scour predictions have been tested with field data. A laboratory wide-pier scour equation from previous findings is evaluated here with field data. A wide range of field data was used, consisting of both live-bed and clear-water scour. A method for assessing the quality of the data was developed and applied to the data set. Three other wide-pier scour equations from the literature were used to compare the performance of each predictive method. The best-performing scour equation was analyzed using statistical analysis. Comparisons of computed and observed scour depths indicate that the equation from the previous publication produced the smallest discrepancy ratio and RMSE value when compared with the large amount of laboratory and field data.
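The two comparison statistics mentioned above can be computed as in the following sketch. RMSE follows its standard definition; the abstract does not define the discrepancy ratio, so it is assumed here to be the computed scour depth divided by the observed one. The function names and depth values are hypothetical.

```python
import math


def rmse(computed, observed):
    """Root-mean-square error between computed and observed scour depths."""
    squared = [(c - o) ** 2 for c, o in zip(computed, observed)]
    return math.sqrt(sum(squared) / len(squared))


def discrepancy_ratios(computed, observed):
    """Per-site ratio of computed to observed depth (assumed definition).

    A ratio of 1 means perfect agreement; above 1 means over-prediction.
    """
    return [c / o for c, o in zip(computed, observed)]


if __name__ == "__main__":
    # Hypothetical scour depths in metres.
    computed = [1.0, 2.0, 3.0]
    observed = [1.0, 2.0, 5.0]
    print(rmse(computed, observed), discrepancy_ratios(computed, observed))
```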

Keywords: field data, local scour, scour equation, wide piers

Procedia PDF Downloads 387
7891 The Maximum Throughput Analysis of UAV Datalink 802.11b Protocol

Authors: Inkyu Kim, SangMan Moon

Abstract:

The IEEE 802.11b protocol provides a data rate of up to 11 Mbps, whereas the aerospace industry seeks higher data rate COTS data link systems for UAVs. The Total Maximum Throughput (TMT) and delay time have been studied by many researchers in past years. This paper provides the theoretical data throughput performance of a UAV formation flight data link using existing 802.11b performance theory. We operate a UAV formation flight of more than 30 quadcopters with the 802.11b protocol. We predict that the number of UAVs in a formation flight is bounded by the performance limitations of the data link protocol.

Keywords: UAV datalink, UAV formation flight datalink, UAV WLAN datalink application, UAV IEEE 802.11b datalink application

Procedia PDF Downloads 370
7890 Methods for Distinction of Cattle Using Supervised Learning

Authors: Radoslav Židek, Veronika Šidlová, Radovan Kasarda, Birgit Fuerst-Waltl

Abstract:

Machine learning represents a set of topics dealing with the creation and evaluation of algorithms that facilitate pattern recognition, classification, and prediction, based on models derived from existing data. The data can contain identification patterns that are used to classify instances into groups. The result of the analysis is a pattern that can be used to identify a data set without access to the input data used to create the pattern. An important requirement in this process is careful data preparation, validation of the model used, and its suitable interpretation. For breeders, it is important to know the origin of animals from the point of view of genetic diversity. Where pedigree information is missing, other methods can be used to trace an animal's origin. The genetic diversity recorded in genetic data holds relatively useful information for identifying animals originating from individual countries. We can conclude that the application of data mining to molecular genetic data using supervised learning is an appropriate tool for hypothesis testing and identifying an individual.

Keywords: genetic data, Pinzgau cattle, supervised learning, machine learning

Procedia PDF Downloads 532
7889 Unsteady and Steady State in Natural Convection

Authors: Syukri Himran, Erwin Eka Putra, Nanang Roni

Abstract:

This study examines the natural convection of a viscous fluid flowing along a semi-infinite vertical plate. The set of governing equations describing continuity, momentum and energy has been reduced to dimensionless form by introducing reference variables. To solve the problem, the equations are formulated with an explicit finite-difference scheme in time-dependent form, and the computations are performed by a Fortran program. The results describe the velocity and temperature profiles in both transient and steady-state conditions. An approximate value of the heat transfer coefficient and the effects of Pr on the convection flow are also presented.

Keywords: natural convection, vertical plate, velocity and temperature profiles, steady and unsteady

Procedia PDF Downloads 473
7888 Noise Reduction in Web Data: A Learning Approach Based on Dynamic User Interests

Authors: Julius Onyancha, Valentina Plekhanova

Abstract:

One of the significant issues facing web users is the amount of noise in web data which hinders the process of finding useful information in relation to their dynamic interests. Current research works consider noise as any data that does not form part of the main web page and propose noise web data reduction tools which mainly focus on eliminating noise in relation to the content and layout of web data. This paper argues that not all data that form part of the main web page is of a user interest and not all noise data is actually noise to a given user. Therefore, learning of noise web data allocated to the user requests ensures not only reduction of noisiness level in a web user profile, but also a decrease in the loss of useful information hence improves the quality of a web user profile. Noise Web Data Learning (NWDL) tool/algorithm capable of learning noise web data in web user profile is proposed. The proposed work considers elimination of noise data in relation to dynamic user interest. In order to validate the performance of the proposed work, an experimental design setup is presented. The results obtained are compared with the current algorithms applied in noise web data reduction process. The experimental results show that the proposed work considers the dynamic change of user interest prior to elimination of noise data. The proposed work contributes towards improving the quality of a web user profile by reducing the amount of useful information eliminated as noise.

Keywords: web log data, web user profile, user interest, noise web data learning, machine learning

Procedia PDF Downloads 246
7887 Data Mining and Knowledge Management Application to Enhance Business Operations: An Exploratory Study

Authors: Zeba Mahmood

Abstract:

Modern business organizations are adopting technological advancements to achieve a competitive edge and satisfy their consumers. Developments in the field of information technology systems have changed the way business is conducted today. Business operations now rely more heavily on the data they obtain, and this data is continuously increasing in volume. Data stored in different locations is difficult to find and use without the effective implementation of data mining and knowledge management techniques. Organizations that smartly identify and obtain data, and then convert it into useful formats for decision making and operational improvements, create additional value for their customers and enhance their operational capabilities. Marketers and customer relationship departments of firms use data mining techniques to make relevant decisions. This paper emphasizes the identification of different data mining and knowledge management techniques that are applied in different business industries. The challenges and issues in executing these techniques are also discussed and critically analyzed.

Keywords: knowledge, knowledge management, knowledge discovery in databases, business, operational, information, data mining

Procedia PDF Downloads 513
7886 Analyzing Large Scale Recurrent Event Data with a Divide-And-Conquer Approach

Authors: Jerry Q. Cheng

Abstract:

Currently, in analyzing large-scale recurrent event data, there are many challenges such as memory limitations, unscalable computing time, etc. In this research, a divide-and-conquer method is proposed using parametric frailty models. Specifically, the data is randomly divided into many subsets, and the maximum likelihood estimator from each individual data set is obtained. Then a weighted method is proposed to combine these individual estimators as the final estimator. It is shown that this divide-and-conquer estimator is asymptotically equivalent to the estimator based on the full data. Simulation studies are conducted to demonstrate the performance of this proposed method. This approach is applied to a large real dataset of repeated heart failure hospitalizations.
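The divide-and-conquer idea described in this abstract can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: instead of the parametric frailty models used in the paper, it fits a plain constant-rate event model (exponential gap times, no frailty), and it uses inverse-variance weights as one plausible weighting scheme. The function names, the simulated data, and the choice of weights are all hypothetical and not taken from the paper.

```python
import random


def simulate_subject(rate, follow_up, rng):
    """Count recurrent events in [0, follow_up] using exponential gap times."""
    t, count = rng.expovariate(rate), 0
    while t <= follow_up:
        count += 1
        t += rng.expovariate(rate)
    return count, follow_up


def subset_mle(subset):
    """MLE of a constant event rate and its estimated variance for one subset."""
    events = sum(n for n, _ in subset)
    exposure = sum(t for _, t in subset)
    rate_hat = events / exposure
    var_hat = rate_hat / exposure  # inverse Fisher information at rate_hat
    return rate_hat, var_hat


def divide_and_conquer(data, n_subsets, rng):
    """Randomly split the data, estimate per subset, combine with weights."""
    shuffled = data[:]
    rng.shuffle(shuffled)
    subsets = [shuffled[i::n_subsets] for i in range(n_subsets)]
    estimates = [subset_mle(s) for s in subsets]
    weights = [1.0 / var for _, var in estimates]  # assumed weighting scheme
    weighted_sum = sum(w * est for w, (est, _) in zip(weights, estimates))
    return weighted_sum / sum(weights)


rng = random.Random(0)
# Simulated (event count, follow-up time) pairs; true rate 0.5 is an assumption.
data = [simulate_subject(0.5, rng.uniform(1.0, 5.0), rng) for _ in range(10000)]
full_rate, _ = subset_mle(data)
dc_rate = divide_and_conquer(data, 10, rng)
print(full_rate, dc_rate)
```

In this toy setting the combined estimate stays close to the full-data estimate, which mirrors the asymptotic equivalence claimed in the abstract without proving it for frailty models.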

Keywords: big data analytics, divide-and-conquer, recurrent event data, statistical computing

Procedia PDF Downloads 144
7885 Adoption of Big Data by Global Chemical Industries

Authors: Ashiff Khan, A. Seetharaman, Abhijit Dasgupta

Abstract:

The new era of big data (BD) is influencing chemical industries tremendously, providing several opportunities to reshape the way they operate and helping them shift towards intelligent manufacturing. Despite the availability of free software and the large amount of real-time data generated and stored in process plants, chemical industries are still in the early stages of big data adoption. The industry is just starting to realize the importance of the large amount of data it owns for making the right decisions and supporting its strategies. This article explores the importance of the professional competencies and data science that influence BD in chemical industries and that can help the industry move towards intelligent manufacturing quickly and reliably. The article uses a literature review and identifies potential applications in the chemical industry for moving from conventional methods to a data-driven approach. The scope of this document is limited to the adoption of BD in chemical industries and the variables identified in this article. To achieve this objective, government, academia, and industry must work together to overcome all present and future challenges.

Keywords: chemical engineering, big data analytics, industrial revolution, professional competence, data science

Procedia PDF Downloads 66
7884 The Rational Mode of Affordable Housing Based on the Special Residence Space Form of City Village in Xiamen

Authors: Pingrong Liao

Abstract:

China is currently in a stage of rapid urbanization: a large rural population has flowed into the cities, and solving the housing problem is urgent. Xiamen is a typical Chinese city characterized by high housing prices and low incomes. Because the government has failed to provide adequate low-cost public housing, a large number of migrants dwell in informal rental housing, represented by the "city village". Comfortable housing is a prerequisite for the harmony and stability of the city. Therefore, taking the "city village" and affordable housing as the main objects of study, this paper analyzes the housing status, personnel distribution and mobility of the "city villages" of Xiamen, and carries out preliminary research on basic facilities such as the residential form and commercial and property management services. These findings are combined with the existing status of affordable housing in Xiamen, and a summary and comparison are made in an attempt to provide references and experience for the construction and improvement of government-subsidized housing, so as to improve the residential quality of poverty-stricken urban residents. The data and results are collated and quantified objectively based on the relevant literature, the latest market data and practical investigation, using comparative study and case analysis as research methods. The informal rental housing, informal economy and informal management of the "city village" as a social-housing unit fit the housing needs of the floating population in many ways, providing convenient and efficient conditions for the movement of people.
However, the existing urban housing in Xiamen has some drawbacks: for example, the housing is unevenly distributed, the spatial form is monotonous, the allocation standard of public service facilities is not targeted at the subsidized residents, and the property management system is imperfect and too costly. This paper therefore draws lessons from the informal model of the "city village" and finally puts forward some improvement strategies.

Keywords: urban problem, urban village, affordable housing, living mode, Xiamen constructing

Procedia PDF Downloads 227
7883 Secure Multiparty Computations for Privacy Preserving Classifiers

Authors: M. Sumana, K. S. Hareesha

Abstract:

Secure computations are essential when performing privacy-preserving data mining. Distributed privacy-preserving data mining involves two or more sites that cannot pool their data with a third party, because doing so would violate laws protecting the individual. Hence, in order to model the private data without compromising privacy or losing information, secure multiparty computations are used. Secure computations of the product, mean, variance, dot product, and sigmoid function using the additive and multiplicative homomorphic properties are discussed. The computations are performed on vertically partitioned data, with a single site holding the class value.
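To make the additive homomorphic property concrete, the following Python sketch computes a secure dot product between two parties that hold different columns of the same records. It is an illustration only, not the authors' protocol: it assumes a textbook Paillier cryptosystem (the abstract does not name a scheme), uses toy-sized primes that offer no real security, and all function names are hypothetical. It requires Python 3.8 or later for modular inverses via `pow`.

```python
import math
import random


def keygen(p, q):
    """Toy Paillier key generation with generator g = n + 1."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # modular inverse; this form is valid for g = n + 1
    return n, (lam, mu)


def encrypt(n, m, rng):
    """Encrypt integer m (0 <= m < n) under public key n."""
    n2 = n * n
    r = rng.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = rng.randrange(1, n)
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2


def decrypt(n, priv, c):
    """Recover the plaintext from ciphertext c with the private key."""
    lam, mu = priv
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n


def encrypted_dot(n, enc_x, y):
    """Party B: combine A's ciphertexts with its own plaintext vector y.

    Multiplying ciphertexts adds plaintexts, and raising a ciphertext to
    the power y_i multiplies its plaintext by y_i.
    """
    n2 = n * n
    acc = 1
    for c, yi in zip(enc_x, y):
        acc = (acc * pow(c, yi, n2)) % n2
    return acc


rng = random.Random(1)
n, priv = keygen(1009, 1013)  # toy primes, for illustration only

# Additive property: E(5) * E(7) decrypts to 12.
sum_plain = decrypt(n, priv, (encrypt(n, 5, rng) * encrypt(n, 7, rng)) % (n * n))

# Party A holds x and the private key; party B holds y (vertical partition).
x, y = [1, 2, 3], [4, 5, 6]
enc_x = [encrypt(n, xi, rng) for xi in x]
dot_plain = decrypt(n, priv, encrypted_dot(n, enc_x, y))
print(sum_plain, dot_plain)
```

Party B never sees the values in `x`, and party A learns only the final dot product, provided the result is smaller than `n`.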

Keywords: homomorphic property, secure product, secure mean and variance, secure dot product, vertically partitioned data

Procedia PDF Downloads 399
7882 Cross Project Software Fault Prediction at Design Phase

Authors: Pradeep Singh, Shrish Verma

Abstract:

Software fault prediction models are created using the source code, processed metrics from the same or a previous version of the code, and related fault data. Some companies do not store and keep track of all the artifacts required for software fault prediction. To construct a fault prediction model for such a company, training data from other projects is one potential solution. The earlier a fault is predicted, the less it costs to correct. The training data consists of metrics data and related fault data at the function/module level. This paper investigates fault prediction at an early stage using cross-project data, focusing on design metrics. In this study, an empirical analysis is carried out to validate design metrics for cross-project fault prediction. The machine learning technique used for evaluation is Naïve Bayes. The design phase metrics of other projects can be used as an initial guideline for projects where no previous fault data is available. We analyze seven data sets from the NASA Metrics Data Program, which offer design as well as code metrics. Overall, the results of cross-project learning are comparable to those of within-company data learning.
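The cross-project setup described here (train on one project's design metrics, predict faults in another) can be sketched with a small Gaussian Naïve Bayes classifier in plain Python. The sketch makes several assumptions: the abstract does not say which Naïve Bayes variant is used, the two metric columns are unnamed placeholders, and the tiny data set is invented purely for illustration and is not drawn from the NASA Metrics Data Program.

```python
import math


def fit_gaussian_nb(X, y):
    """Estimate class priors and per-feature means/variances."""
    model = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # A small constant keeps variances strictly positive.
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-9
                     for col, m in zip(columns, means)]
        model[label] = (n / len(y), means, variances)
    return model


def predict(model, x):
    """Return the class with the highest log posterior for feature vector x."""
    def log_posterior(prior, means, variances):
        total = math.log(prior)
        for v, m, var in zip(x, means, variances):
            total += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        return total
    return max(model, key=lambda label: log_posterior(*model[label]))


# Hypothetical design metrics per module; label 1 = faulty, 0 = not faulty.
source_X = [[2, 1], [3, 2], [2, 2], [4, 1], [9, 7], [10, 8], [8, 9], [11, 7]]
source_y = [0, 0, 0, 0, 1, 1, 1, 1]

# A different (target) project that has no fault history of its own.
target_X = [[3, 1], [10, 9], [2, 3], [9, 8]]
target_y = [0, 1, 0, 1]  # held back only to check the predictions

model = fit_gaussian_nb(source_X, source_y)
predictions = [predict(model, x) for x in target_X]
print(predictions)
```

The target project's labels are used only for evaluation, which matches the scenario in which the target company has no stored fault data.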

Keywords: software metrics, fault prediction, cross project, within project.

Procedia PDF Downloads 324
7881 Multimedia Data Fusion for Event Detection in Twitter by Using Dempster-Shafer Evidence Theory

Authors: Samar M. Alqhtani, Suhuai Luo, Brian Regan

Abstract:

Data fusion technology can be the best way to extract useful information from multiple sources of data, and it has been widely applied in various applications. This paper presents a data fusion approach for multimedia data for event detection in Twitter using Dempster-Shafer evidence theory. The methodology applies a mining algorithm to detect the event. There are two types of data in the fusion. The first is features extracted from text using the bag-of-words method, which is calculated using term frequency-inverse document frequency (TF-IDF). The second is visual features extracted by applying the scale-invariant feature transform (SIFT). The Dempster-Shafer theory of evidence is applied in order to fuse the information from these two sources. Our experiments indicate that, compared with approaches using an individual data source, the proposed data fusion approach can increase the prediction accuracy for event detection. The experimental results showed that the proposed method achieved a high accuracy of 0.97, compared with 0.93 with text only and 0.86 with images only.
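The fusion step can be illustrated with Dempster's rule of combination, which merges two mass functions defined over the same frame of discernment. The Python sketch below is a minimal example under stated assumptions: the two-hypothesis frame (`event` / `no_event`) and the mass values for the text and image sources are invented for illustration and do not come from the paper.

```python
from itertools import product


def combine(m1, m2):
    """Dempster's rule: fuse two mass functions keyed by frozenset hypotheses."""
    combined, conflict = {}, 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + wa * wb
        else:
            conflict += wa * wb  # mass assigned to contradictory evidence
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; rule is undefined")
    # Renormalize by the non-conflicting mass.
    return {s: w / (1.0 - conflict) for s, w in combined.items()}


EVENT = frozenset({"event"})
NO_EVENT = frozenset({"no_event"})
EITHER = EVENT | NO_EVENT  # ignorance: mass not committed to either hypothesis

# Hypothetical masses from a text-based and an image-based classifier.
text_mass = {EVENT: 0.6, NO_EVENT: 0.1, EITHER: 0.3}
image_mass = {EVENT: 0.5, NO_EVENT: 0.2, EITHER: 0.3}

fused = combine(text_mass, image_mass)
for hypothesis, mass in fused.items():
    print(sorted(hypothesis), round(mass, 4))
```

When both sources lean towards the same hypothesis, the fused mass for that hypothesis exceeds either individual mass, which is the intuition behind combining text and image evidence.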

Keywords: data fusion, Dempster-Shafer theory, data mining, event detection

Procedia PDF Downloads 391
7880 Adaptive Data Approximations Codec (ADAC) for AI/ML-based Cyber-Physical Systems

Authors: Yong-Kyu Jung

Abstract:

The fast growth in information technology has led to demands to access and process data. Cyber-physical systems (CPSs) depend heavily on the timing of hardware/software operations and communication over the network (i.e., real-time/parallel operations in CPSs such as autonomous vehicles). Data processing is an important means of overcoming the issues confronting data management, reducing the gap between technological growth on the one hand and data complexity and channel bandwidth on the other. An adaptive perpetual data approximation method is introduced to manage the actual entropy of the digital spectrum. An ADAC, implemented as an accelerator and/or apps for servers and smart-connected devices, adaptively rescales digital contents (avg. 62.8%), data processing/access time and energy, and encryption/decryption overheads in AI/ML applications (facial ID/recognition).

Keywords: adaptive codec, AI, ML, HPC, cyber-physical, cybersecurity

Procedia PDF Downloads 65