Search results for: big data computation
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 24471

Search results for: big data computation

24291 An Interpolation Tool for Data Transfer in Two-Dimensional Ice Accretion Problems

Authors: Marta Cordero-Gracia, Mariola Gomez, Olivier Blesbois, Marina Carrion

Abstract:

One of the difficulties in icing simulations is for extended periods of exposure, when very large ice shapes are created. As well as being large, they can have complex shapes, such as a double horn. For icing simulations, these configurations are currently computed in several steps. The icing step is stopped when the ice shapes become too large, at which point a new mesh has to be created to allow for further CFD and ice growth simulations to be performed. This can be very costly, and is a limiting factor in the simulations that can be performed. A way to avoid the costly human intervention in the re-meshing step of multistep icing computation is to use mesh deformation instead of re-meshing. The aim of the present work is to apply an interpolation method based on Radial Basis Functions (RBF) to transfer deformations from surface mesh to volume mesh. This deformation tool has been developed specifically for icing problems. It is able to deal with localized, sharp and large deformations, unlike the tools traditionally used for more smooth wing deformations. This tool will be presented along with validation on typical two-dimensional icing shapes.

Keywords: ice accretion, interpolation, mesh deformation, radial basis functions

Procedia PDF Downloads 284
24290 Investigations of Flow Field with Different Turbulence Models on NREL Phase VI Blade

Authors: T. Y. Liu, C. H. Lin, Y. M. Ferng

Abstract:

Wind energy is one of the clean renewable energy. However, the low frequency (20-200HZ) noise generated from the wind turbine blades, which bothers the residents, becomes the major problem to be developed. It is useful for predicting the aerodynamic noise by flow field and pressure distribution analysis on the wind turbine blades. Therefore, the main objective of this study is to use different turbulence models to analyse the flow field and pressure distributions of the wing blades. Three-dimensional Computation Fluid Dynamics (CFD) simulation of the flow field was used to calculate the flow phenomena for the National Renewable Energy Laboratory (NREL) Phase VI horizontal axis wind turbine rotor. Two different flow cases with different wind speeds were investigated: 7m/s with 72rpm and 15m/s with 72rpm. Four kinds of RANS-based turbulence models, Standard k-ε, Realizable k-ε, SST k-ω, and v2f, were used to predict and analyse the results in the present work. The results show that the predictions on pressure distributions with SST k-ω and v2f turbulence models have good agreements with experimental data.

Keywords: horizontal axis wind turbine, turbulence model, noise, fluid dynamics

Procedia PDF Downloads 236
24289 Multichannel Scheme under Fairness Environment for Cognitive Radio Networks

Authors: Hans Marquez Ramos, Cesar Hernandez, Ingrid Páez

Abstract:

This paper develops a multiple channel assignment model, which allows to take advantage in most efficient way, spectrum opportunities in cognitive radio networks. Developed scheme allows make several available and frequency adjacent channel assignments, which require a bigger wide band, under an equality environment. The hybrid assignment model it is made by to algorithms, one who makes the ranking and select available frequency channels and the other one in charge of establishing an equality criteria, in order to not restrict spectrum opportunities for all other secondary users who wish to make transmissions. Measurements made were done for average bandwidth, average delay, as well fairness computation for several channel assignment. Reached results were evaluated with experimental spectrum occupational data from GSM frequency band captured. Developed model, shows evidence of improvement in spectrum opportunity use and a wider average transmit bandwidth for each secondary user, maintaining equality criteria in channel assignment.

Keywords: bandwidth, fairness, multichannel, secondary users

Procedia PDF Downloads 467
24288 High Performance Computing and Big Data Analytics

Authors: Branci Sarra, Branci Saadia

Abstract:

Because of the multiplied data growth, many computer science tools have been developed to process and analyze these Big Data. High-performance computing architectures have been designed to meet the treatment needs of Big Data (view transaction processing standpoint, strategic, and tactical analytics). The purpose of this article is to provide a historical and global perspective on the recent trend of high-performance computing architectures especially what has a relation with Analytics and Data Mining.

Keywords: high performance computing, HPC, big data, data analysis

Procedia PDF Downloads 484
24287 A Landscape of Research Data Repositories in Re3data.org Registry: A Case Study of Indian Repositories

Authors: Prashant Shrivastava

Abstract:

The purpose of this study is to explore re3dat.org registry to identify research data repositories registration workflow process. Further objective is to depict a graph for present development of research data repositories in India. Preliminarily with an approach to understand re3data.org registry framework and schema design then further proceed to explore the status of research data repositories of India in re3data.org registry. Research data repositories are getting wider relevance due to e-research concepts. Now available registry re3data.org is a good tool for users and researchers to identify appropriate research data repositories as per their research requirements. In Indian environment, a compatible National Research Data Policy is the need of the time to boost the management of research data. Registry for Research Data Repositories is a crucial tool to discover specific information in specific domain. Also, Research Data Repositories in India have not been studied. Re3data.org registry and status of Indian research data repositories both discussed in this study.

Keywords: research data, research data repositories, research data registry, re3data.org

Procedia PDF Downloads 295
24286 A Study of Cloud Computing Solution for Transportation Big Data Processing

Authors: Ilgin Gökaşar, Saman Ghaffarian

Abstract:

The need for fast processed big data of transportation ridership (eg., smartcard data) and traffic operation (e.g., traffic detectors data) which requires a lot of computational power is incontrovertible in Intelligent Transportation Systems. Nowadays cloud computing is one of the important subjects and popular information technology solution for data processing. It enables users to process enormous measure of data without having their own particular computing power. Thus, it can also be a good selection for transportation big data processing as well. This paper intends to examine how the cloud computing can enhance transportation big data process with contrasting its advantages and disadvantages, and discussing cloud computing features.

Keywords: big data, cloud computing, Intelligent Transportation Systems, ITS, traffic data processing

Procedia PDF Downloads 422
24285 Harmonic Data Preparation for Clustering and Classification

Authors: Ali Asheibi

Abstract:

The rapid increase in the size of databases required to store power quality monitoring data has demanded new techniques for analysing and understanding the data. One suggested technique to assist in analysis is data mining. Preparing raw data to be ready for data mining exploration take up most of the effort and time spent in the whole data mining process. Clustering is an important technique in data mining and machine learning in which underlying and meaningful groups of data are discovered. Large amounts of harmonic data have been collected from an actual harmonic monitoring system in a distribution system in Australia for three years. This amount of acquired data makes it difficult to identify operational events that significantly impact the harmonics generated on the system. In this paper, harmonic data preparation processes to better understanding of the data have been presented. Underlying classes in this data has then been identified using clustering technique based on the Minimum Message Length (MML) method. The underlying operational information contained within the clusters can be rapidly visualised by the engineers. The C5.0 algorithm was used for classification and interpretation of the generated clusters.

Keywords: data mining, harmonic data, clustering, classification

Procedia PDF Downloads 218
24284 Linguistic Summarization of Structured Patent Data

Authors: E. Y. Igde, S. Aydogan, F. E. Boran, D. Akay

Abstract:

Patent data have an increasingly important role in economic growth, innovation, technical advantages and business strategies and even in countries competitions. Analyzing of patent data is crucial since patents cover large part of all technological information of the world. In this paper, we have used the linguistic summarization technique to prove the validity of the hypotheses related to patent data stated in the literature.

Keywords: data mining, fuzzy sets, linguistic summarization, patent data

Procedia PDF Downloads 245
24283 Proposal of Data Collection from Probes

Authors: M. Kebisek, L. Spendla, M. Kopcek, T. Skulavik

Abstract:

In our paper we describe the security capabilities of data collection. Data are collected with probes located in the near and distant surroundings of the company. Considering the numerous obstacles e.g. forests, hills, urban areas, the data collection is realized in several ways. The collection of data uses connection via wireless communication, LAN network, GSM network and in certain areas data are collected by using vehicles. In order to ensure the connection to the server most of the probes have ability to communicate in several ways. Collected data are archived and subsequently used in supervisory applications. To ensure the collection of the required data, it is necessary to propose algorithms that will allow the probes to select suitable communication channel.

Keywords: communication, computer network, data collection, probe

Procedia PDF Downloads 331
24282 A Review on Big Data Movement with Different Approaches

Authors: Nay Myo Sandar

Abstract:

With the growth of technologies and applications, a large amount of data has been producing at increasing rate from various resources such as social media networks, sensor devices, and other information serving devices. This large collection of massive, complex and exponential growth of dataset is called big data. The traditional database systems cannot store and process such data due to large and complexity. Consequently, cloud computing is a potential solution for data storage and processing since it can provide a pool of resources for servers and storage. However, moving large amount of data to and from is a challenging issue since it can encounter a high latency due to large data size. With respect to big data movement problem, this paper reviews the literature of previous works, discusses about research issues, finds out approaches for dealing with big data movement problem.

Keywords: Big Data, Cloud Computing, Big Data Movement, Network Techniques

Procedia PDF Downloads 53
24281 Optimized Approach for Secure Data Sharing in Distributed Database

Authors: Ahmed Mateen, Zhu Qingsheng, Ahmad Bilal

Abstract:

In the current age of technology, information is the most precious asset of a company. Today, companies have a large amount of data. As the data become larger, access to data for some particular information is becoming slower day by day. Faster data processing to shape it in the form of information is the biggest issue. The major problems in distributed databases are the efficiency of data distribution and response time of data distribution. The security of data distribution is also a big issue. For these problems, we proposed a strategy that can maximize the efficiency of data distribution and also increase its response time. This technique gives better results for secure data distribution from multiple heterogeneous sources. The newly proposed technique facilitates the companies for secure data sharing efficiently and quickly.

Keywords: ER-schema, electronic record, P2P framework, API, query formulation

Procedia PDF Downloads 300
24280 Vulnerability of People to Climate Change: Influence of Methods and Computation Approaches on Assessment Outcomes

Authors: Adandé Belarmain Fandohan

Abstract:

Climate change has become a major concern globally, particularly in rural communities that have to find rapid coping solutions. Several vulnerability assessment approaches have been developed in the last decades. This comes along with a higher risk for different methods to result in different conclusions, thereby making comparisons difficult and decision-making non-consistent across areas. The effect of methods and computational approaches on estimates of people’s vulnerability was assessed using data collected from the Gambia. Twenty-four indicators reflecting vulnerability components: (exposure, sensitivity, and adaptive capacity) were selected for this purpose. Data were collected through household surveys and key informant interviews. One hundred and fifteen respondents were surveyed across six communities and two administrative districts. Results were compared over three computational approaches: the maximum value transformation normalization, the z-score transformation normalization, and simple averaging. Regardless of the approaches used, communities that have high exposure to climate change and extreme events were the most vulnerable. Furthermore, the vulnerability was strongly related to the socio-economic characteristics of farmers. The survey evidenced variability in vulnerability among communities and administrative districts. Comparing output across approaches, overall, people in the study area were found to be highly vulnerable using the simple average and maximum value transformation, whereas they were only moderately vulnerable using the z-score transformation approach. It is suggested that assessment approach-induced discrepancies be accounted for in international debates to harmonize/standardize assessment approaches to the end of making outputs comparable across regions. This will also likely increase the relevance of decision-making for adaptation policies.

Keywords: maximum value transformation, simple averaging, vulnerability assessment, West Africa, z-score transformation

Procedia PDF Downloads 78
24279 Using Derivative Free Method to Improve the Error Estimation of Numerical Quadrature

Authors: Chin-Yun Chen

Abstract:

Numerical integration is an essential tool for deriving different physical quantities in engineering and science. The effectiveness of a numerical integrator depends on different factors, where the crucial one is the error estimation. This work presents an error estimator that combines a derivative free method to improve the performance of verified numerical quadrature.

Keywords: numerical quadrature, error estimation, derivative free method, interval computation

Procedia PDF Downloads 431
24278 Improvement in Quality-Factor Superconducting Co-Planer Waveguide Resonators by Passivation Air-Interfaces Using Self-Assembled Monolayers

Authors: Saleem Rao, Mohammed Al-Ghadeer, Archan Banerjee, Hossein Fariborzi

Abstract:

Materials imperfection, particularly two-level-system (TLS) defects in planer superconducting quantum circuits, contributes significantly to decoherence, ultimately limiting the performance of quantum computation and sensing. Oxides at air interfaces are among the host of TLS, and different material has been used to reduce TLS losses. Passivation with an inorganic layer is not an option to reduce these interface oxides; however, they can be etched away, but their regrowth remains a problem. Here, we report the chemisorption of molecular self-assembled monolayers (SAMs) at air interfaces of superconducting co-planer waveguide (CPW) resonators that suppress the regrowth of oxides and also modify the dielectric constant of the interface. With SAMs, we observed sustained order of magnitude improvement in quality factor -better than oxide etched interfaces. Quality factor measurements at millikelvin temperature and at single photon, XPS data, and TEM images of SAM passivated air interface sustenance our claim. Compatibility of SAM with micro-/nano-fabrication processes opens new ways to improve the coherence time in cQED.

Keywords: superconducting circuits, quality-factor, self-assembled monolayer, coherence

Procedia PDF Downloads 40
24277 Computing Continuous Skyline Queries without Discriminating between Static and Dynamic Attributes

Authors: Ibrahim Gomaa, Hoda M. O. Mokhtar

Abstract:

Although most of the existing skyline queries algorithms focused basically on querying static points through static databases; with the expanding number of sensors, wireless communications and mobile applications, the demand for continuous skyline queries has increased. Unlike traditional skyline queries which only consider static attributes, continuous skyline queries include dynamic attributes, as well as the static ones. However, as skyline queries computation is based on checking the domination of skyline points over all dimensions, considering both the static and dynamic attributes without separation is required. In this paper, we present an efficient algorithm for computing continuous skyline queries without discriminating between static and dynamic attributes. Our algorithm in brief proceeds as follows: First, it excludes the points which will not be in the initial skyline result; this pruning phase reduces the required number of comparisons. Second, the association between the spatial positions of data points is examined; this phase gives an idea of where changes in the result might occur and consequently enables us to efficiently update the skyline result (continuous update) rather than computing the skyline from scratch. Finally, experimental evaluation is provided which demonstrates the accuracy, performance and efficiency of our algorithm over other existing approaches.

Keywords: continuous query processing, dynamic database, moving object, skyline queries

Procedia PDF Downloads 191
24276 Application of the Standard Deviation in Regulating Design Variation of Urban Solutions Generated through Evolutionary Computation

Authors: Mohammed Makki, Milad Showkatbakhsh, Aiman Tabony

Abstract:

Computational applications of natural evolutionary processes as problem-solving tools have been well established since the mid-20th century. However, their application within architecture and design has only gained ground in recent years, with an increasing number of academics and professionals in the field electing to utilize evolutionary computation to address problems comprised from multiple conflicting objectives with no clear optimal solution. Recent advances in computer science and its consequent constructive influence on the architectural discourse has led to the emergence of multiple algorithmic processes capable of simulating the evolutionary process in nature within an efficient timescale. Many of the developed processes of generating a population of candidate solutions to a design problem through an evolutionary based stochastic search process are often driven through the application of both environmental and architectural parameters. These methods allow for conflicting objectives to be simultaneously, independently, and objectively optimized. This is an essential approach in design problems with a final product that must address the demand of a multitude of individuals with various requirements. However, one of the main challenges encountered through the application of an evolutionary process as a design tool is the ability for the simulation to maintain variation amongst design solutions in the population while simultaneously increasing in fitness. This is most commonly known as the ‘golden rule’ of balancing exploration and exploitation over time; the difficulty of achieving this balance in the simulation is due to the tendency of either variation or optimization being favored as the simulation progresses. In such cases, the generated population of candidate solutions has either optimized very early in the simulation, or has continued to maintain high levels of variation to which an optimal set could not be discerned; thus, providing the user with a solution set that has not evolved efficiently to the objectives outlined in the problem at hand. As such, the experiments presented in this paper seek to achieve the ‘golden rule’ by incorporating a mathematical fitness criterion for the development of an urban tissue comprised from the superblock as its primary architectural element. The mathematical value investigated in the experiments is the standard deviation factor. Traditionally, the standard deviation factor has been used as an analytical value rather than a generative one, conventionally used to measure the distribution of variation within a population by calculating the degree by which the majority of the population deviates from the mean. A higher standard deviation value delineates a higher number of the population is clustered around the mean and thus limited variation within the population, while a lower standard deviation value is due to greater variation within the population and a lack of convergence towards an optimal solution. The results presented will aim to clarify the extent to which the utilization of the standard deviation factor as a fitness criterion can be advantageous to generating fitter individuals in a more efficient timeframe when compared to conventional simulations that only incorporate architectural and environmental parameters.

Keywords: architecture, computation, evolution, standard deviation, urban

Procedia PDF Downloads 108
24275 Data Mining Algorithms Analysis: Case Study of Price Predictions of Lands

Authors: Julio Albuja, David Zaldumbide

Abstract:

Data analysis is an important step before taking a decision about money. The aim of this work is to analyze the factors that influence the final price of the houses through data mining algorithms. To our best knowledge, previous work was researched just to compare results. Furthermore, before using the data of the data set, the Z-Transformation were used to standardize the data in the same range. Hence, the data was classified into two groups to visualize them in a readability format. A decision tree was built, and graphical data is displayed where clearly is easy to see the results and the factors' influence in these graphics. The definitions of these methods are described, as well as the descriptions of the results. Finally, conclusions and recommendations are presented related to the released results that our research showed making it easier to apply these algorithms using a customized data set.

Keywords: algorithms, data, decision tree, transformation

Procedia PDF Downloads 344
24274 Fault Tolerant and Testable Designs of Reversible Sequential Building Blocks

Authors: Vishal Pareek, Shubham Gupta, Sushil Chandra Jain

Abstract:

With increasing high-speed computation demand the power consumption, heat dissipation and chip size issues are posing challenges for logic design with conventional technologies. Recovery of bit loss and bit errors is other issues that require reversibility and fault tolerance in the computation. The reversible computing is emerging as an alternative to conventional technologies to overcome the above problems and helpful in a diverse area such as low-power design, nanotechnology, quantum computing. Bit loss issue can be solved through unique input-output mapping which require reversibility and bit error issue require the capability of fault tolerance in design. In order to incorporate reversibility a number of combinational reversible logic based circuits have been developed. However, very few sequential reversible circuits have been reported in the literature. To make the circuit fault tolerant, a number of fault model and test approaches have been proposed for reversible logic. In this paper, we have attempted to incorporate fault tolerance in sequential reversible building blocks such as D flip-flop, T flip-flop, JK flip-flop, R-S flip-flop, Master-Slave D flip-flop, and double edge triggered D flip-flop by making them parity preserving. The importance of this proposed work lies in the fact that it provides the design of reversible sequential circuits completely testable for any stuck-at fault and single bit fault. In our opinion our design of reversible building blocks is superior to existing designs in term of quantum cost, hardware complexity, constant input, garbage output, number of gates and design of online testable D flip-flop have been proposed for the first time. We hope our work can be extended for building complex reversible sequential circuits.

Keywords: parity preserving gate, quantum computing, fault tolerance, flip-flop, sequential reversible logic

Procedia PDF Downloads 519
24273 Application of Blockchain Technology in Geological Field

Authors: Mengdi Zhang, Zhenji Gao, Ning Kang, Rongmei Liu

Abstract:

Management and application of geological big data is an important part of China's national big data strategy. With the implementation of a national big data strategy, geological big data management becomes more and more critical. At present, there are still a lot of technology barriers as well as cognition chaos in many aspects of geological big data management and application, such as data sharing, intellectual property protection, and application technology. Therefore, it’s a key task to make better use of new technologies for deeper delving and wider application of geological big data. In this paper, we briefly introduce the basic principle of blockchain technology at the beginning and then make an analysis of the application dilemma of geological data. Based on the current analysis, we bring forward some feasible patterns and scenarios for the blockchain application in geological big data and put forward serval suggestions for future work in geological big data management.

Keywords: blockchain, intellectual property protection, geological data, big data management

Procedia PDF Downloads 52
24272 Detection of Curvilinear Structure via Recursive Anisotropic Diffusion

Authors: Sardorbek Numonov, Hyohun Kim, Dongwha Shin, Yeonseok Kim, Ji-Su Ahn, Dongeun Choi, Byung-Woo Hong

Abstract:

The detection of curvilinear structures often plays an important role in the analysis of images. In particular, it is considered as a crucial step for the diagnosis of chronic respiratory diseases to localize the fissures in chest CT imagery where the lung is divided into five lobes by the fissures that are characterized by linear features in appearance. However, the characteristic linear features for the fissures are often shown to be subtle due to the high intensity variability, pathological deformation or image noise involved in the imaging procedure, which leads to the uncertainty in the quantification of anatomical or functional properties of the lung. Thus, it is desired to enhance the linear features present in the chest CT images so that the distinctiveness in the delineation of the lobe is improved. We propose a recursive diffusion process that prefers coherent features based on the analysis of structure tensor in an anisotropic manner. The local image features associated with certain scales and directions can be characterized by the eigenanalysis of the structure tensor that is often regularized via isotropic diffusion filters. However, the isotropic diffusion filters involved in the computation of the structure tensor generally blur geometrically significant structure of the features leading to the degradation of the characteristic power in the feature space. Thus, it is required to take into consideration of local structure of the feature in scale and direction when computing the structure tensor. We apply an anisotropic diffusion in consideration of scale and direction of the features in the computation of the structure tensor that subsequently provides the geometrical structure of the features by its eigenanalysis that determines the shape of the anisotropic diffusion kernel. The recursive application of the anisotropic diffusion with the kernel the shape of which is derived from the structure tensor leading to the anisotropic scale-space where the geometrical features are preserved via the eigenanalysis of the structure tensor computed from the diffused image. The recursive interaction between the anisotropic diffusion based on the geometry-driven kernels and the computation of the structure tensor that determines the shape of the diffusion kernels yields a scale-space where geometrical properties of the image structure are effectively characterized. We apply our recursive anisotropic diffusion algorithm to the detection of curvilinear structure in the chest CT imagery where the fissures present curvilinear features and define the boundary of lobes. It is shown that our algorithm yields precise detection of the fissures while overcoming the subtlety in defining the characteristic linear features. The quantitative evaluation demonstrates the robustness and effectiveness of the proposed algorithm for the detection of fissures in the chest CT in terms of the false positive and the true positive measures. The receiver operating characteristic curves indicate the potential of our algorithm as a segmentation tool in the clinical environment. This work was supported by the MISP(Ministry of Science and ICT), Korea, under the National Program for Excellence in SW (20170001000011001) supervised by the IITP(Institute for Information and Communications Technology Promotion).

Keywords: anisotropic diffusion, chest CT imagery, chronic respiratory disease, curvilinear structure, fissure detection, structure tensor

Procedia PDF Downloads 208
24271 3D Classification Optimization of Low-Density Airborne Light Detection and Ranging Point Cloud by Parameters Selection

Authors: Baha Eddine Aissou, Aichouche Belhadj Aissa

Abstract:

Light detection and ranging (LiDAR) is an active remote sensing technology used for several applications. Airborne LiDAR is becoming an important technology for the acquisition of a highly accurate dense point cloud. A classification of airborne laser scanning (ALS) point cloud is a very important task that still remains a real challenge for many scientists. Support vector machine (SVM) is one of the most used statistical learning algorithms based on kernels. SVM is a non-parametric method, and it is recommended to be used in cases where the data distribution cannot be well modeled by a standard parametric probability density function. Using a kernel, it performs a robust non-linear classification of samples. Often, the data are rarely linearly separable. SVMs are able to map the data into a higher-dimensional space to become linearly separable, which allows performing all the computations in the original space. This is one of the main reasons that SVMs are well suited for high-dimensional classification problems. Only a few training samples, called support vectors, are required. SVM has also shown its potential to cope with uncertainty in data caused by noise and fluctuation, and it is computationally efficient as compared to several other methods. Such properties are particularly suited for remote sensing classification problems and explain their recent adoption. In this poster, the SVM classification of ALS LiDAR data is proposed. Firstly, connected component analysis is applied for clustering the point cloud. Secondly, the resulting clusters are incorporated in the SVM classifier. Radial basic function (RFB) kernel is used due to the few numbers of parameters (C and γ) that needs to be chosen, which decreases the computation time. In order to optimize the classification rates, the parameters selection is explored. It consists to find the parameters (C and γ) leading to the best overall accuracy using grid search and 5-fold cross-validation. The exploited LiDAR point cloud is provided by the German Society for Photogrammetry, Remote Sensing, and Geoinformation. The ALS data used is characterized by a low density (4-6 points/m²) and is covering an urban area located in residential parts of the city Vaihingen in southern Germany. The class ground and three other classes belonging to roof superstructures are considered, i.e., a total of 4 classes. The training and test sets are selected randomly several times. The obtained results demonstrated that a parameters selection can orient the selection in a restricted interval of (C and γ) that can be further explored but does not systematically lead to the optimal rates. The SVM classifier with hyper-parameters is compared with the most used classifiers in literature for LiDAR data, random forest, AdaBoost, and decision tree. The comparison showed the superiority of the SVM classifier using parameters selection for LiDAR data compared to other classifiers.

Keywords: classification, airborne LiDAR, parameters selection, support vector machine

Procedia PDF Downloads 122
24270 Frequent Item Set Mining for Big Data Using MapReduce Framework

Authors: Tamanna Jethava, Rahul Joshi

Abstract:

Frequent Item sets play an essential role in many data Mining tasks that try to find interesting patterns from the database. Typically it refers to a set of items that frequently appear together in transaction dataset. There are several mining algorithm being used for frequent item set mining, yet most do not scale to the type of data we presented with today, so called “BIG DATA”. Big Data is a collection of large data sets. Our approach is to work on the frequent item set mining over the large dataset with scalable and speedy way. Big Data basically works with Map Reduce along with HDFS is used to find out frequent item sets from Big Data on large cluster. This paper focuses on using pre-processing & mining algorithm as hybrid approach for big data over Hadoop platform.

Keywords: frequent item set mining, big data, Hadoop, MapReduce

Procedia PDF Downloads 389
24269 The Role Of Data Gathering In NGOs

Authors: Hussaini Garba Mohammed

Abstract:

Background/Significance: The lack of data gathering is affecting NGOs world-wide in general to have good data information about educational and health related issues among communities in any country and around the world. For example, HIV/AIDS smoking (Tuberculosis diseases) and COVID-19 virus carriers is becoming a serious public health problem, especially among old men and women. But there is no full details data survey assessment from communities, villages, and rural area in some countries to show the percentage of victims and patients, especial with this world COVID-19 virus among the people. These data are essential to inform programming targets, strategies, and priorities in getting good information about data gathering in any society.

Keywords: reliable information, data assessment, data mining, data communication

Procedia PDF Downloads 155
24268 Bayesian System and Copula for Event Detection and Summarization of Soccer Videos

Authors: Dhanuja S. Patil, Sanjay B. Waykar

Abstract:

Event detection is a standout amongst the most key parts for distinctive sorts of area applications of video data framework. Recently, it has picked up an extensive interest of experts and in scholastics from different zones. While detecting video event has been the subject of broad study efforts recently, impressively less existing methodology has considered multi-model data and issues related efficiency. Start of soccer matches different doubtful circumstances rise that can't be effectively judged by the referee committee. A framework that checks objectively image arrangements would prevent not right interpretations because of some errors, or high velocity of the events. Bayesian networks give a structure for dealing with this vulnerability using an essential graphical structure likewise the probability analytics. We propose an efficient structure for analysing and summarization of soccer videos utilizing object-based features. The proposed work utilizes the t-cherry junction tree, an exceptionally recent advancement in probabilistic graphical models, to create a compact representation and great approximation intractable model for client’s relationships in an interpersonal organization. There are various advantages in this approach firstly; the t-cherry gives best approximation by means of junction trees class. Secondly, to construct a t-cherry junction tree can be to a great extent parallelized; and at last inference can be performed utilizing distributed computation. Examination results demonstrates the effectiveness, adequacy, and the strength of the proposed work which is shown over a far reaching information set, comprising more soccer feature, caught at better places.

Keywords: summarization, detection, Bayesian network, t-cherry tree

Procedia PDF Downloads 294
24267 The Application of Data Mining Technology in Building Energy Consumption Data Analysis

Authors: Liang Zhao, Jili Zhang, Chongquan Zhong

Abstract:

Energy consumption data, in particular those involving public buildings, are impacted by many factors: the building structure, climate/environmental parameters, construction, system operating condition, and user behavior patterns. Traditional methods for data analysis are insufficient. This paper delves into the data mining technology to determine its application in the analysis of building energy consumption data including energy consumption prediction, fault diagnosis, and optimal operation. Recent literature are reviewed and summarized, the problems faced by data mining technology in the area of energy consumption data analysis are enumerated, and research points for future studies are given.

Keywords: data mining, data analysis, prediction, optimization, building operational performance

Procedia PDF Downloads 815
24266 To Handle Data-Driven Software Development Projects Effectively

Authors: Shahnewaz Khan

Abstract:

Machine learning (ML) techniques are often used in projects for creating data-driven applications. These tasks typically demand additional research and analysis. The proper technique and strategy must be chosen to ensure the success of data-driven projects. Otherwise, even exerting a lot of effort, the necessary development might not always be possible. In this post, an effort to examine the workflow of data-driven software development projects and its implementation process in order to describe how to manage a project successfully. Which will assist in minimizing the added workload.

Keywords: data, data-driven projects, data science, NLP, software project

Procedia PDF Downloads 54
24265 The Relationship Between Artificial Intelligence, Data Science, and Privacy

Authors: M. Naidoo

Abstract:

Artificial intelligence often requires large amounts of good quality data. Within important fields, such as healthcare, the training of AI systems predominately relies on health and personal data; however, the usage of this data is complicated by various layers of law and ethics that seek to protect individuals’ privacy rights. This research seeks to establish the challenges AI and data sciences pose to (i) informational rights, (ii) privacy rights, and (iii) data protection. To solve some of the issues presented, various methods are suggested, such as embedding values in technological development, proper balancing of rights and interests, and others.

Keywords: artificial intelligence, data science, law, policy

Procedia PDF Downloads 81
24264 A Parallel Implementation of k-Means in MATLAB

Authors: Dimitris Varsamis, Christos Talagkozis, Alkiviadis Tsimpiris, Paris Mastorocostas

Abstract:

The aim of this work is the parallel implementation of k-means in MATLAB, in order to reduce the execution time. Specifically, a new function in MATLAB for serial k-means algorithm is developed, which meets all the requirements for the conversion to a function in MATLAB with parallel computations. Additionally, two different variants for the definition of initial values are presented. In the sequel, the parallel approach is presented. Finally, the performance tests for the computation times respect to the numbers of features and classes are illustrated.

Keywords: K-means algorithm, clustering, parallel computations, Matlab

Procedia PDF Downloads 345
24263 Simulation Data Summarization Based on Spatial Histograms

Authors: Jing Zhao, Yoshiharu Ishikawa, Chuan Xiao, Kento Sugiura

Abstract:

In order to analyze large-scale scientific data, research on data exploration and visualization has gained popularity. In this paper, we focus on the exploration and visualization of scientific simulation data, and define a spatial V-Optimal histogram for data summarization. We propose histogram construction algorithms based on a general binary hierarchical partitioning as well as a more specific one, the l-grid partitioning. For effective data summarization and efficient data visualization in scientific data analysis, we propose an optimal algorithm as well as a heuristic algorithm for histogram construction. To verify the effectiveness and efficiency of the proposed methods, we conduct experiments on the massive evacuation simulation data.

Keywords: simulation data, data summarization, spatial histograms, exploration, visualization

Procedia PDF Downloads 155
24262 Numerical Investigation of the Effect of Blast Pressure on Discrete Model in Shock Tube

Authors: Aldin Justin Sundararaj, Austin Lord Tennyson, Divya Jose, A. N. Subash

Abstract:

Blast waves are generated due to the explosions of high energy materials. An explosion yielding a blast wave has the potential to cause severe damage to buildings and its personnel. In order to understand the physics of effects of blast pressure on buildings, studies in the shock tube on generic configurations are carried out at various pressures on discrete models. The strength of shock wave is systematically varied by using different driver gases and diaphragm thickness. The basic material of the diaphragm is Aluminum. To simulate the effect of shock waves on discrete models a shock tube was used. Generic models selected for this study are suitably scaled cylinder, cone and cubical blocks. The experiments were carried out with 2mm diaphragm with burst pressure ranging from 28 to 31 bar. Numerical analysis was carried out over these discrete models. A 3D model of shock-tube with different discrete models inside the tube was used for CFD computation. It was found that cone has dissipated most of the shock pressure compared to cylinder and cubical block. The robustness and the accuracy of the numerical model were validation with the analytical and experimental data.

Keywords: shock wave, blast wave, discrete models, shock tube

Procedia PDF Downloads 288