Search results for: big data computation
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 24469

24439 Parallel Evaluation of Sommerfeld Integrals for Multilayer Dyadic Green's Function

Authors: Duygu Kan, Mehmet Cayoren

Abstract:

Sommerfeld integrals (SIs) are commonly encountered in electromagnetics problems involving the analysis of antennas and scatterers embedded in planar multilayered media. Generally speaking, an analytical solution of SIs is unavailable, and their numerical evaluation is well known to be time consuming and computationally expensive due to the highly oscillatory and slowly decaying nature of the integrands. Fast computation of SIs is therefore of paramount importance. In this paper, a parallel code has been developed to speed up the computation of SIs in the framework of calculating the dyadic Green's function in multilayered media. An OpenMP shared-memory approach is used to parallelize the SI algorithm, resulting in significant time savings. Moreover, accelerating the computation of the dyadic Green's function is discussed based on the parallel SI algorithm developed.
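
The production parallelization is OpenMP/C territory; purely as a language-neutral illustration of the same decomposition, the sketch below splits an oscillatory, slowly decaying model integrand into independent panels and integrates them concurrently. The integrand, panel count, and grid density are invented for illustration and are not the paper's code.

```python
import numpy as np
from concurrent.futures import ProcessPoolExecutor

def panel_integral(bounds):
    # Model integrand: highly oscillatory with slow decay, standing in
    # for a Sommerfeld-type tail integrand.
    a, b = bounds
    x = np.linspace(a, b, 2001)
    y = np.cos(40.0 * x) * np.exp(-0.05 * x)
    dx = x[1] - x[0]
    return dx * (y[0] / 2 + y[1:-1].sum() + y[-1] / 2)  # trapezoidal rule

if __name__ == "__main__":
    edges = np.linspace(0.0, 200.0, 65)        # split the tail into 64 panels
    panels = list(zip(edges[:-1], edges[1:]))
    with ProcessPoolExecutor() as pool:        # concurrent map over panels
        total = sum(pool.map(panel_integral, panels))
    print(f"integral ~ {total:.6e}")
```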

Keywords: Sommerfeld integrals, multilayer dyadic Green’s function, OpenMP, shared-memory parallel programming

Procedia PDF Downloads 213
24438 Exploiting Non-Uniform Utility of Computing: A Case Study

Authors: Arnab Sarkar, Michael Huang, Chuang Ren, Jun Li

Abstract:

The increasing importance of computing in modern society has brought substantial growth in the demand for computational power. In some problem domains, such as scientific simulations, available computational power still limits what can be practically explored. For many types of code, the utility of computation is non-uniform: not every piece of computation contributes equally to the quality of the result. If this non-uniformity is understood well and exploited effectively, available computing power can be used far more effectively. In this paper, we discuss a case study of exploiting such non-uniformity in a particle-in-cell simulation platform. We find both that significant non-uniformity exists and that it is generally straightforward to exploit. We show the potential for an order-of-magnitude effective performance gain while keeping comparable output quality. We also discuss challenges in both the practical application of the idea and the evaluation of its impact.

Keywords: approximate computing, Landau damping, non-uniform utility computing, particle-in-cell

Procedia PDF Downloads 228
24437 Discussion on Big Data and One of Its Early Training Applications

Authors: Fulya Gokalp Yavuz, Mark Daniel Ward

Abstract:

This study focuses on a contemporary and unavoidable topic in Data Science, together with an exemplary application for early career building: Big Data and the Living Learning Community (LLC). Academia and industry agree on the importance of Big Data, yet both risk missing out on training in this interdisciplinary area. Some traditional teaching doctrines are far from effective for Data Science: practitioners need intuition and real-life examples of how to apply new methods to data on the scale of terabytes. We outline the scope of Data Science training and illustrate its early-stage application with the LLC, a National Science Foundation (NSF) funded project under the supervision of Prof. Ward since 2014. Essentially, we aim to give professors, researchers, and practitioners some intuition for combining data science tools into comprehensive real-life examples, guided by mentees' feedback. By discussing mentoring methods and the computational challenges of Big Data, we intend to underline its potential with further concrete examples.

Keywords: Big Data, computation, mentoring, training

Procedia PDF Downloads 327
24436 A CORDIC Based Design Technique for Efficient Computation of DCT

Authors: Deboraj Muchahary, Amlan Deep Borah, Abir J. Mondal, Alak Majumder

Abstract:

A discrete cosine transform (DCT) is described, and a technique to compute it using the fast Fourier transform (FFT) is developed. In this work, the DCT of a finite-length sequence is obtained by incorporating the CORDIC methodology into the radix-2 FFT algorithm. The proposed methodology is simple to comprehend and maintains a regular structure, thereby reducing computational complexity. DCTs are used extensively in digital signal processing, for example in pattern recognition, so efficient computation of the DCT with a transparent design flow is highly desirable.
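
The hardware primitive underneath such a design is the CORDIC rotation, which produces the cosine/sine factors of an FFT or DCT stage by shift-and-add style micro-rotations instead of multipliers or ROM tables. A minimal floating-point sketch of rotation-mode CORDIC (real designs use fixed-point arithmetic; this is illustrative only):

```python
import math

def cordic_cos_sin(theta, n_iter=24):
    # Rotation-mode CORDIC: rotate the vector (K, 0) by theta through a
    # sequence of micro-rotations by atan(2^-i); converges for |theta| <~ 1.74 rad.
    angles = [math.atan(2.0 ** -i) for i in range(n_iter)]
    K = math.prod(1.0 / math.sqrt(1.0 + 2.0 ** (-2 * i)) for i in range(n_iter))
    x, y, z = K, 0.0, theta                    # pre-scale by the CORDIC gain
    for i, a in enumerate(angles):
        d = 1.0 if z >= 0.0 else -1.0          # steer the residual angle to 0
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * a
    return x, y                                # ~ (cos(theta), sin(theta))

print(cordic_cos_sin(math.pi / 5))
print(math.cos(math.pi / 5), math.sin(math.pi / 5))
```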

Keywords: DCT, DFT, CORDIC, FFT

Procedia PDF Downloads 444
24435 Numerical Computation of Specific Absorption Rate and Induced Current for Workers Exposed to Static Magnetic Fields of MRI Scanners

Authors: Sherine Farrag

Abstract:

Currently used MRI scanners in Cairo possess static magnetic fields (SMFs) varying from 0.25 T up to 3 T; more than half operate at 1.5 T. The SMF of the magnet determines the diagnostic power of a scanner, but not the worker's exposure profile. This paper presents an approach for the numerical computation of induced electric fields and SAR values based on estimation of the fringe static magnetic field. The iso-gauss lines of the scanner were mapped, and a 7th-degree polynomial function was fitted and tested. The current induced by worker motion in the SMF and the SAR values for organs and tissues were then calculated. The results illustrate that the computational tool used permits quick, accurate mapping of MRI iso-gauss lines and calculation of SAR values, which can then be used to assess the occupational exposure profile of MRI operators.
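
As a rough illustration of the pipeline, one can fit the mapped fringe field with a 7th-degree polynomial and estimate the motion-induced electric field from Faraday's law (E = (r/2)·dB/dt for a circular tissue loop). All numbers below, the fringe-field decay, walking speed, and loop radius, are invented stand-ins, not the paper's measurements:

```python
import numpy as np

z = np.linspace(0.0, 3.0, 25)                 # positions along a walking path (m)
B = 1.5 * np.exp(-1.6 * z)                    # hypothetical 1.5 T fringe-field decay
fringe = np.poly1d(np.polyfit(z, B, 7))       # 7th-degree fit, as in the paper

v = 1.0                                       # assumed walking speed (m/s)
dBdt = np.abs(fringe.deriv()(z)) * v          # motion-induced dB/dt along the path
r = 0.10                                      # assumed effective tissue loop radius (m)
E_induced = (r / 2.0) * dBdt                  # Faraday's law for a circular loop
print(f"peak induced E ~ {E_induced.max():.3f} V/m")
```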

Keywords: MRI occupational exposure, MRI safety, induced current density, specific absorption rate, static magnetic fields

Procedia PDF Downloads 403
24434 A Mutually Exclusive Task Generation Method Based on Data Augmentation

Authors: Haojie Wang, Xun Li, Rui Yin

Abstract:

To address memorization overfitting in the meta-learning MAML algorithm, a method of generating mutually exclusive tasks based on data augmentation is proposed. The method generates a mutex task by mapping one feature of the data to multiple labels, so that the generated mutex task is inconsistent with the data distribution of the initial dataset. Because generating mutex tasks for all data would produce a large amount of invalid data and, in the worst case, lead to exponential growth of computation, this paper also proposes a key-data extraction method that extracts only part of the data to generate the mutex tasks. Experiments show that generating mutually exclusive tasks effectively alleviates memorization overfitting in the meta-learning MAML algorithm.
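
A minimal sketch of the one-feature-to-many-labels idea follows; the purely random relabeling here is the simplest possible instantiation (an assumption), and the paper's augmentation-based construction and key-data extraction step are more structured:

```python
import random

def make_mutex_tasks(features, n_labels, n_tasks, seed=0):
    # Each generated task assigns labels independently, so the same feature
    # is paired with different labels across tasks and a fixed
    # feature-to-label mapping cannot be memorized by the meta-learner.
    rng = random.Random(seed)
    return [[(x, rng.randrange(n_labels)) for x in features]
            for _ in range(n_tasks)]

features = ["cheap flights now", "great movie", "server down again"]
for task in make_mutex_tasks(features, n_labels=3, n_tasks=2):
    print(task)
```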

Keywords: data augmentation, mutex task generation, meta-learning, text classification

Procedia PDF Downloads 64
24433 Computation of Natural Logarithm Using Abstract Chemical Reaction Networks

Authors: Iuliia Zarubiieva, Joyun Tseng, Vishwesh Kulkarni

Abstract:

Recent research has focused on nucleic acids as a substrate for designing biomolecular circuits for in situ monitoring and control. A common approach is to express such circuits as a set of idealised abstract chemical reaction networks (ACRNs). Here, we present new results on how abstract chemical reactions, viz. catalysis, annihilation and degradation, can be used to implement a circuit that accurately computes the natural logarithm using the arithmetic-geometric mean (AGM) method, which has not previously been used in conjunction with ACRNs.
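
The AGM route to the logarithm is compact to state numerically: for large s, ln(s) ≈ π / (2·AGM(1, 4/s)). A plain floating-point sketch of the computation that the ACRN is designed to realize (the chemical encoding itself is not reproduced here):

```python
import math

def agm(a, b, tol=1e-15):
    # Arithmetic-geometric mean: iterate the two means to a common limit.
    while abs(a - b) > tol * max(a, b):
        a, b = (a + b) / 2.0, math.sqrt(a * b)
    return (a + b) / 2.0

def ln_agm(x, m=40):
    # ln(s) ~ pi / (2 * AGM(1, 4/s)) for large s; take s = x * 2**m and
    # subtract m*ln(2), with ln(2) treated as a precomputed constant.
    s = x * 2.0 ** m
    return math.pi / (2.0 * agm(1.0, 4.0 / s)) - m * 0.6931471805599453

print(ln_agm(2.0), math.log(2.0))
print(ln_agm(10.0), math.log(10.0))
```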

Keywords: chemical reaction networks, ratio computation, stability, robustness

Procedia PDF Downloads 134
24432 A Mutually Exclusive Task Generation Method Based on Data Augmentation

Authors: Haojie Wang, Xun Li, Rui Yin

Abstract:

To address memorization overfitting in the model-agnostic meta-learning (MAML) algorithm, a method of generating mutually exclusive tasks based on data augmentation is proposed. The method generates a mutex task by mapping one feature of the data to multiple labels, so that the generated mutex task is inconsistent with the data distribution of the initial dataset. Because generating mutex tasks for all data would produce a large amount of invalid data and, in the worst case, lead to exponential growth of computation, this paper also proposes a key-data extraction method that extracts only part of the data to generate the mutex tasks. Experiments show that generating mutually exclusive tasks effectively alleviates memorization overfitting in the meta-learning MAML algorithm.

Keywords: mutex task generation, data augmentation, meta-learning, text classification

Procedia PDF Downloads 97
24431 Operator Optimization Based on Hardware Architecture Alignment Requirements

Authors: Qingqing Gai, Junxing Shen, Yu Luo

Abstract:

Due to hardware architecture characteristics, some operators achieve better performance when the input/output tensor dimensions are aligned to a certain minimum granularity; convolution and deconvolution, both common in deep learning, are examples. When the requirement is not met, the usual strategy is to zero-pad to satisfy it, potentially leading to under-utilization of hardware resources. Therefore, for convolutions and deconvolutions whose input and output channels do not meet the minimum granularity alignment, we propose transferring W-dimensional data to the C-dimension for computation (W2C), enabling the C-dimension to meet the hardware requirement. The scheme also reduces the number of computations along the W-dimension. Although the transformation can increase the nominal amount of computation, the operator's speed improves significantly, achieving remarkable speedups on multiple hardware accelerators, including NVIDIA Tensor Cores, Qualcomm digital signal processors (DSPs), and Huawei neural processing units (NPUs). All that is needed is to modify the network structure and rearrange the operator weights offline, without retraining. Similarly, for some operators, such as ReduceMax, we observe that transferring C-dimensional data to the W-dimension (C2W) and replacing the ReduceMax with a MaxPool can accomplish acceleration under certain circumstances.
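
The data-layout half of W2C is a pure reshape-and-transpose; a NumPy sketch with an illustrative fold factor (the matching offline rearrangement of the convolution weights is omitted):

```python
import numpy as np

def w2c(x, f):
    # Fold a factor f of the W axis into channels:
    # (N, C, H, W) -> (N, C*f, H, W/f), raising the channel count toward
    # the accelerator's alignment granularity.
    n, c, h, w = x.shape
    assert w % f == 0, "W must be divisible by the fold factor"
    return (x.reshape(n, c, h, w // f, f)
             .transpose(0, 1, 4, 2, 3)
             .reshape(n, c * f, h, w // f))

x = np.arange(2 * 3 * 4 * 8, dtype=np.float32).reshape(2, 3, 4, 8)
print(x.shape, "->", w2c(x, f=4).shape)       # (2, 3, 4, 8) -> (2, 12, 4, 2)
```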

Keywords: convolution, deconvolution, W2C, C2W, alignment, hardware accelerator

Procedia PDF Downloads 72
24430 Analyzing the Factors that Cause Parallel Performance Degradation in Parallel Graph-Based Computations Using Graph500

Authors: Mustafa Elfituri, Jonathan Cook

Abstract:

Recently, graph-based computations have become more important in large-scale scientific computing, as they provide a methodology for modeling many types of relations between independent objects. They are actively used in fields as varied as biology, social networks, cybersecurity, and computer networks. At the same time, graph problems have properties such as irregularity and poor locality that make their performance characteristics differ from those of regular applications. Parallelizing graph algorithms is therefore a hard and challenging task. Initial evidence is that standard computer architectures do not perform very well on graph algorithms, and little is known about exactly why. The Graph500 benchmark is a representative application for parallel graph-based computations, which have highly irregular data access and are driven more by traversing connected data than by computation. In this paper, we present results from analyzing the performance of several implementations of Graph500, including a shared-memory (OpenMP) version, a distributed (MPI) version, and a hybrid version. We measured and analyzed all the factors that affect performance in order to identify possible changes that would improve it, and we discuss the results in relation to which factors contribute to performance degradation.

Keywords: graph computation, Graph500 benchmark, parallel architectures, parallel programming, workload characterization

Procedia PDF Downloads 110
24429 Gaussian Mixture Model Based Identification of Arterial Wall Movement for Computation of Distension Waveform

Authors: Ravindra B. Patil, P. Krishnamoorthy, Shriram Sethuraman

Abstract:

This work proposes a novel Gaussian Mixture Model (GMM) based approach for accurate tracking of the arterial wall and subsequent computation of the distension waveform from the radio-frequency (RF) ultrasound signal. The approach was evaluated on ultrasound RF data acquired with a prototype ultrasound system from an artery-mimicking flow phantom. The effectiveness of the proposed algorithm is demonstrated by comparison with existing wall-tracking algorithms: the experimental results show that the proposed method reduces the error margin in tracking arterial wall movement by 20% relative to existing approaches. Coupled with an ultrasound system, this approach can be used to estimate the arterial compliance parameters required for screening of cardiovascular disorders.
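
To make the GMM idea concrete, a toy sketch using scikit-learn (synthetic echo depths stand in for RF data; the paper's actual feature extraction from the RF signal is not reproduced): fit a two-component mixture so the component means track the near and far wall; their separation gives the instantaneous diameter, whose variation over the cardiac cycle is the distension waveform.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Synthetic frame: wall echoes near 10 mm and 16 mm plus uniform clutter.
depths = np.concatenate([
    rng.normal(10.0, 0.15, 200),
    rng.normal(16.0, 0.15, 200),
    rng.uniform(5.0, 21.0, 60),
])
gmm = GaussianMixture(n_components=2, random_state=0).fit(depths.reshape(-1, 1))
near, far = sorted(gmm.means_.ravel())
print(f"walls at ~{near:.2f} mm and ~{far:.2f} mm, diameter ~ {far - near:.2f} mm")
```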

Keywords: distension waveform, Gaussian Mixture Model, RF ultrasound, arterial wall movement

Procedia PDF Downloads 475
24428 GPU Based Real-Time Floating Object Detection System

Authors: Jie Yang, Jian-Min Meng

Abstract:

This paper presents a GPU-based floating-object detection scheme designed for floating-mine detection tasks. The system uses contrast and motion information to eliminate as many false positives as possible while avoiding false negatives, and the GPU computation platform allows objects to be detected in real time. The experimental results show that, with a certain configuration, the GPU-based scheme can speed up the computation by up to one thousand times compared to the CPU-based scheme.
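
A minimal sketch of combining the two cues; the simple frame-difference motion test and the fixed thresholds are assumptions for illustration, not the paper's pipeline, which runs this kind of per-pixel work on the GPU:

```python
import numpy as np

def detect_floating(prev, curr, motion_thr=12.0, contrast_thr=20.0):
    # A pixel is a candidate only if it both moved between frames and stands
    # out from the mean background brightness; each cue vetoes one family of
    # false positives.
    prev = prev.astype(np.float32)
    curr = curr.astype(np.float32)
    motion = np.abs(curr - prev) > motion_thr
    contrast = np.abs(curr - curr.mean()) > contrast_thr
    return motion & contrast

rng = np.random.default_rng(0)
sea = rng.normal(128.0, 4.0, (240, 320))      # calm-water background
frame1, frame2 = sea.copy(), sea.copy()
frame2[100:110, 150:160] += 60.0              # bright object enters frame 2
print(detect_floating(frame1, frame2).sum(), "candidate pixels")
```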

Keywords: object detection, GPU, motion estimation, parallel processing

Procedia PDF Downloads 444
24427 Crow Search Algorithm-Based Task Offloading Strategies for Fog Computing Architectures

Authors: Aniket Ganvir, Ritarani Sahu, Suchismita Chinara

Abstract:

The rapid digitization of various aspects of life is leading to the creation of smart IoT ecosystems in which interconnected devices generate significant amounts of valuable data. However, IoT devices face constraints such as limited computational resources and bandwidth. Cloud computing offers ample resources for offloading tasks efficiently, but it introduces latency issues, especially for time-sensitive applications. Fog computing (FC) addresses these latency concerns by bringing computation and storage closer to the network edge, minimizing the distance data must travel and enhancing efficiency. Offloading tasks to fog nodes or the cloud can conserve energy and extend IoT device lifespan. The offloading process is intricate, with tasks categorized as full or partial, and its optimization is an NP-hard problem that traditional greedy search methods struggle to handle efficiently. To overcome this, the efficient crow search algorithm (ECSA) is proposed as a meta-heuristic optimization algorithm that effectively optimizes computation offloading.
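
For orientation, the sketch below implements the standard crow search update rule (following Askarzadeh's formulation) on a stand-in continuous objective; the paper's ECSA refinements and its actual offloading cost model are not public and are not reproduced here:

```python
import numpy as np

def crow_search(cost, dim, n_crows=20, iters=200, fl=2.0, ap=0.1, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.uniform(-5, 5, (n_crows, dim))      # current positions
    mem = x.copy()                              # each crow's best-known cache
    mem_cost = np.array([cost(v) for v in mem])
    for _ in range(iters):
        for i in range(n_crows):
            j = rng.integers(n_crows)           # crow i follows a random crow j
            if rng.random() >= ap:              # j unaware: move toward j's cache
                x[i] = x[i] + rng.random() * fl * (mem[j] - x[i])
            else:                               # j aware: i is misled, jumps randomly
                x[i] = rng.uniform(-5, 5, dim)
            c = cost(x[i])
            if c < mem_cost[i]:                 # update memory on improvement
                mem[i], mem_cost[i] = x[i].copy(), c
    return mem[mem_cost.argmin()], mem_cost.min()

best, val = crow_search(lambda v: float(np.sum(v * v)), dim=5)
print(best.round(3), val)
```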

Keywords: IoT, fog computing, task offloading, efficient crow search algorithm

Procedia PDF Downloads 12
24426 A Time-Reducible Approach to Compute Determinant |I-X|

Authors: Wang Xingbo

Abstract:

Computation of determinants of the form |I-X| is fundamental because it helps to compute many other determinants. This article puts forward a time-reducible approach to computing the determinant |I-X|. The approach is derived from Newton's identity, and its time complexity is no more than that of computing the eigenvalues of the square matrix X. Mathematical derivations and a numerical example are presented in detail. Compared with classical approaches, the new approach is shown to be superior, and its computation time naturally decreases as the efficiency of eigenvalue computation for the square matrix improves.
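
A direct sketch of the Newton's-identity route: recover the elementary symmetric functions e_k of X's eigenvalues from the power sums p_k = tr(X^k), then det(I - X) = sum over k of (-1)^k e_k. This naive version forms matrix powers explicitly and is meant only to show the identity at work, not to match the paper's complexity bound:

```python
import numpy as np

def det_I_minus_X(X):
    n = X.shape[0]
    p = np.zeros(n + 1)
    Xk = np.eye(n)
    for k in range(1, n + 1):                  # power sums p_k = tr(X^k)
        Xk = Xk @ X
        p[k] = np.trace(Xk)
    e = np.zeros(n + 1)
    e[0] = 1.0
    for k in range(1, n + 1):                  # Newton: k e_k = sum (-1)^(i-1) e_{k-i} p_i
        e[k] = sum((-1) ** (i - 1) * e[k - i] * p[i] for i in range(1, k + 1)) / k
    return sum((-1) ** k * e[k] for k in range(n + 1))

X = np.random.default_rng(1).normal(size=(6, 6))
print(det_I_minus_X(X), np.linalg.det(np.eye(6) - X))   # the two should agree
```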

Keywords: algorithm, determinant, computation, eigenvalue, time complexity

Procedia PDF Downloads 390
24425 Computation of Radiotherapy Treatment Plans Based on CT to ED Conversion Curves

Authors: B. Petrović, L. Rutonjski, M. Baucal, M. Teodorović, O. Čudić, B. Basarić

Abstract:

Radiotherapy treatment planning computers use CT data of the patient. To compute a treatment plan, the treatment planning system (TPS) must have information on the electron densities of the tissues scanned by CT. This information is given by the conversion curve from CT number to electron density (ED), or simply the calibration curve. Every TPS has built-in default CT-to-ED conversion curves for the CT scanners of different manufacturers; however, it is always recommended to verify the curve before actual clinical use. The objective of this study was to check how well the default curve matches the curve actually measured on a specific CT scanner, and how much the difference influences the calculations of the treatment planning computer. The examined CT scanners were from the same manufacturer: four different scanners spanning three generations. All calibration curves were measured with the dedicated CIRS 062M Electron Density Phantom. The phantom was scanned and, from the real HU values read at the CT console, CT-to-ED conversion curves were generated for different materials at the same tube voltage of 140 kV. Another phantom, the CIRS Thorax 002 LFC, which represents an average human torso in proportion, density, and two-dimensional structure, was used for verification. Treatment planning was done on CT slices of the scanned 002 LFC phantom for selected cases, with interest points set in the lungs and in the spinal cord and doses recorded in the TPS. The overall calculated treatment times for the four scanners and the default data did not differ by more than 0.8%. The overall interest-point dose in bone differed by at most 0.6%, while for single fields the maximum difference was 2.7% (lateral field). The overall interest-point dose in the lungs differed by at most 1.1%, while for single fields the maximum was 2.6% (lateral field). Users are expected to verify the CT-to-ED conversion curve, but developing countries often face a lack of QA equipment and use the default data provided. We conclude that the obtained CT-to-ED curves differ at certain points, generally in the region of higher densities. This influences the treatment planning result; while the effect is not large, it definitely makes a difference in the calculated dose.
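
In a TPS the calibration curve is applied as a piecewise interpolation from CT number to relative electron density, voxel by voxel. A toy sketch of that step (the calibration points below are hypothetical, not the phantom measurements):

```python
import numpy as np

# Hypothetical calibration points from an electron-density phantom scan:
hu = np.array([-1000, -800, -100, 0, 300, 800, 1200])       # CT numbers
ed = np.array([0.00, 0.20, 0.93, 1.00, 1.28, 1.47, 1.70])   # relative electron density

def ct_to_ed(ct_numbers):
    # Piecewise-linear conversion curve, as a TPS applies it per voxel.
    return np.interp(ct_numbers, hu, ed)

print(ct_to_ed([-700, 50, 900]))    # lung-, soft-tissue- and bone-like voxels
```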

Keywords: computation of treatment plan, conversion curve, radiotherapy, electron density

Procedia PDF Downloads 450
24424 Study and Analysis of the Factors Affecting Road Safety Using Decision Tree Algorithms

Authors: Naina Mahajan, Bikram Pal Kaur

Abstract:

The purpose of traffic accident analysis is to find the possible causes of accidents. Road accidents cannot be totally prevented, but suitable traffic engineering and management can reduce the accident rate to a certain extent. This paper discusses the classification techniques C4.5 and ID3 using the WEKA data mining tool, applied to an NH (National Highway) dataset. The techniques give good results, with high accuracy and low computation time and error rate.
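
Both ID3 and C4.5 grow a tree by repeatedly selecting the attribute with the highest information gain (C4.5 refines this with gain ratio and continuous-attribute handling). A minimal sketch of that selection step, with toy records standing in for the NH accident attributes:

```python
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def info_gain(rows, attr, labels):
    # Gain = entropy before the split minus weighted entropy after it.
    by_value = {}
    for row, y in zip(rows, labels):
        by_value.setdefault(row[attr], []).append(y)
    remainder = sum(len(ys) / len(labels) * entropy(ys) for ys in by_value.values())
    return entropy(labels) - remainder

rows = [{"light": "day", "road": "wet"}, {"light": "night", "road": "wet"},
        {"light": "night", "road": "dry"}, {"light": "day", "road": "dry"}]
labels = ["severe", "severe", "minor", "minor"]
for attr in ("light", "road"):
    print(attr, round(info_gain(rows, attr, labels), 3))   # 'road' wins here
```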

Keywords: C4.5, ID3, NH (National Highway), WEKA data mining tool

Procedia PDF Downloads 305
24423 A Unified Webcam Proctoring Solution on Edge

Authors: Saw Thiha, Jay Rajasekera

Abstract:

The boom in video conferencing generates millions of hours of video data daily to be analyzed. Such enormous volumes pose scalability issues, let alone for real-time analysis, as online conferences can involve hundreds of people and last for hours. This paper proposes an efficient online proctoring solution that analyzes online conferences in real time on edge devices such as Android, iOS, and desktop machines. Since the computation is done upfront on the devices where the conferences take place, the solution scales well without requiring intensive resources such as GPU servers or complex cloud infrastructure. According to the fitted linear models, face orientation does indeed impact perceived eye openness. The proposed z-score facial landmark standardization proved effective in detecting face orientation and contributed to classifying eye blinks from a single eyelid-distance computation, achieving a better F1 score and accuracy than the Eye Aspect Ratio (EAR) threshold method. The authors implemented the solution natively in the MediaPipe framework and open-sourced it, along with reproducible experimental results, on GitHub. The solution provides face orientation, eye blink, facial activity, and translation detection out of the box and is highly customizable and extensible.
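
A sketch of the z-score standardization idea: standardizing the landmark coordinates per frame removes scale and translation, after which a single upper-to-lower eyelid distance can be thresholded to call a blink. The landmark indices follow MediaPipe FaceMesh conventions but, like the threshold, are assumptions here rather than the paper's exact values:

```python
import numpy as np

def zscore_landmarks(pts):
    # pts: (n_landmarks, 2) pixel coordinates for one frame.
    return (pts - pts.mean(axis=0)) / pts.std(axis=0)

def eyelid_distance(pts, upper_idx, lower_idx):
    z = zscore_landmarks(pts)
    return float(np.linalg.norm(z[upper_idx] - z[lower_idx]))

rng = np.random.default_rng(0)
face = rng.uniform(0, 640, size=(468, 2))      # fake FaceMesh-style landmark frame
d = eyelid_distance(face, upper_idx=159, lower_idx=145)
print("blink" if d < 0.05 else "open", round(d, 3))
```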

Keywords: Android, desktop, edge computing, blink, face orientation, facial activity and translation, MediaPipe, open source, real-time, video conference, web, iOS, z-score facial landmark standardization

Procedia PDF Downloads 71
24422 Data Hiding by Vector Quantization in Color Image

Authors: Yung Gi Wu

Abstract:

With the growth of computers and networks, digital data can be spread anywhere in the world quickly; it can also be copied or tampered with easily, so security has become an important topic in the protection of digital data. Digital watermarking is a method of protecting the ownership of digital data, and embedding the watermark inevitably affects quality. In this paper, vector quantization (VQ) is used to embed the watermark into the image to achieve data hiding. The watermarking is invisible: users will not notice the embedded watermark even though the watermarked image differs slightly from the original. Because VQ carries a heavy computational burden, we adopt a fast VQ encoding scheme based on partial distortion search (PDS) and a mean-approximation scheme to speed up the data-hiding process. The watermarks hidden in the image can be grayscale, bi-level, or color images; text can also be embedded as a watermark. To test the robustness of the system, we use Photoshop to apply sharpening, cropping, and other alterations, and check whether the extracted watermark is still recognizable. Experimental results demonstrate that the proposed system resists these three kinds of tampering in general cases.
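
The PDS speed-up is a simple early exit inside the nearest-codeword search: while accumulating the squared error dimension by dimension, a codeword is abandoned as soon as its running distortion exceeds the best distortion found so far. A minimal sketch (codebook and block sizes are illustrative):

```python
import numpy as np

def pds_nearest(vec, codebook):
    # Nearest codeword under squared error with partial distortion search.
    best_i, best_d = -1, float("inf")
    for i, cw in enumerate(codebook):
        d = 0.0
        for a, b in zip(vec, cw):
            d += (a - b) ** 2
            if d >= best_d:          # early exit: cannot beat current best
                break
        else:                        # ran to completion: new best codeword
            best_i, best_d = i, d
    return best_i, best_d

rng = np.random.default_rng(0)
codebook = rng.uniform(0, 255, size=(256, 16))   # 256 codewords for 4x4 blocks
block = rng.uniform(0, 255, size=16)
print(pds_nearest(block, codebook))
```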

Keywords: data hiding, vector quantization, watermark, color image

Procedia PDF Downloads 333
24421 Speed up Vector Median Filtering by Quasi Euclidean Norm

Authors: Vinai K. Singh

Abstract:

Median filtering is a powerful tool for reducing impulsive noise without degrading image contours. In multiband images, for example colour images or vector fields obtained by optic-flow computation, a vector median filter can be used. Vector median filters are defined on the basis of a suitable distance, the best-performing distance being the Euclidean one. Euclidean distance is evaluated using the Euclidean norm, which is computationally demanding because a square root is required. In this paper, an optimal piecewise-linear approximation of the Euclidean norm is presented and applied to vector median filtering.
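
To illustrate the idea, the sketch below runs a toy vector median filter on a 2-D vector field using the classic piecewise-linear approximation |v| ~ max + (sqrt(2) - 1)·min as a stand-in for the optimal approximation derived in the paper:

```python
import numpy as np

def quasi_norm(v):
    # Square-root-free piecewise-linear approximation of the 2-D Euclidean
    # norm: |v| ~ max(|vx|,|vy|) + (sqrt(2)-1) * min(|vx|,|vy|).
    a = np.abs(v)
    return a.max(-1) + 0.41421356 * a.min(-1)

def vector_median(window):
    # The vector median minimizes the summed distances to all other vectors.
    d = quasi_norm(window[:, None, :] - window[None, :, :]).sum(axis=1)
    return window[np.argmin(d)]

rng = np.random.default_rng(0)
w = rng.normal(0, 1, size=(9, 2))       # 3x3 neighbourhood of flow vectors
w[4] = [25.0, -25.0]                    # impulsive outlier at the centre
print(vector_median(w))                 # the outlier is rejected
```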

Keywords: Euclidean norm, quasi-Euclidean norm, vector median filtering, applied mathematics

Procedia PDF Downloads 436
24420 Cloud-Based Mobile-to-Mobile Computation Offloading

Authors: Ebrahim Alrashed, Yousef Rafique

Abstract:

Mobile devices have drastically changed the way we do things on the move, and they are relied on heavily to perform tasks once reserved for desktop computers. Computational power on these devices has increased rapidly; however, battery technology remains the bottleneck of their evolution. The primary modern approach to this issue is offloading computation to the cloud, which proves latency-expensive and requires high network bandwidth. In this paper, we explore barter-based mobile-to-mobile offloading: we define a protocol and present an architecture to facilitate the development of such a system, and we highlight the deployment and security challenges.

Keywords: computational offloading, power conservation, cloud, sandboxing

Procedia PDF Downloads 362
24419 Collision Detection Algorithm Based on Data Parallelism

Authors: Zhen Peng, Baifeng Wu

Abstract:

Modern computing technology has entered the era of parallel computing, with a trend toward sustainable and scalable parallelism. Single Instruction Multiple Data (SIMD) is an important way to follow this trend: it gathers ever more computing capability by increasing the number of processor cores, without requiring program modification. Meanwhile, in scientific computing and engineering design, many computation-intensive applications face the challenge of increasingly large amounts of data, and data-parallel computing will be an important way to further improve their performance. In this paper, we take accurate collision detection in building information modeling as an example and demonstrate a model for constructing a data-parallel algorithm. In this model, a complex object is decomposed into sets of simple objects, and collision detection among complex objects is converted into detection among simple objects. The resulting algorithm is a typical SIMD algorithm, and its advantages in parallelism and scalability are unmatched by traditional algorithms.
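
A SIMD-flavoured sketch of the decomposition idea: once complex objects are reduced to sets of simple bounding boxes, all box pairs can be tested at once with vectorized, data-parallel comparisons (NumPy broadcasting stands in for the SIMD lanes):

```python
import numpy as np

def aabb_overlaps(lo_a, hi_a, lo_b, hi_b):
    # lo/hi: (n, 3) and (m, 3) box corners; returns an (n, m) overlap matrix.
    sep = (hi_a[:, None, :] < lo_b[None, :, :]) | (hi_b[None, :, :] < lo_a[:, None, :])
    return ~sep.any(axis=-1)             # overlap iff no separating axis

rng = np.random.default_rng(0)
lo1 = rng.uniform(0, 10, (500, 3)); hi1 = lo1 + rng.uniform(0.1, 1.0, (500, 3))
lo2 = rng.uniform(0, 10, (400, 3)); hi2 = lo2 + rng.uniform(0.1, 1.0, (400, 3))
print(aabb_overlaps(lo1, hi1, lo2, hi2).sum(), "overlapping simple-object pairs")
```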

Keywords: data parallelism, collision detection, single instruction multiple data, building information modeling, continuous scalability

Procedia PDF Downloads 257
24418 Problems of Boolean Reasoning Based Biclustering Parallelization

Authors: Marcin Michalak

Abstract:

Biclustering is a form of two-dimensional data analysis. For several years it has been possible to express the problem in terms of Boolean reasoning, for processing continuous, discrete, and binary data. The mathematical background of the approach, namely the proven ability to induce exact and inclusion-maximal biclusters fulfilling assumed criteria, is a strong advantage of the method. Unfortunately, the core of the method has quite high computational complexity. The paper presents the basics of the Boolean reasoning approach to biclustering and, in that context, raises the problems of parallelizing the computation.

Keywords: Boolean reasoning, biclustering, parallelization, prime implicant

Procedia PDF Downloads 92
24417 Algorithms for Fast Computation of Pan Matrix Profiles of Time Series Under Unnormalized Euclidean Distances

Authors: Jing Zhang, Daniel Nikovski

Abstract:

We propose an approximation algorithm called LINKUMP to compute the Pan Matrix Profile (PMP) under the unnormalized l∞ distance (useful for value-based similarity search) using a double-ended queue and linear interpolation. The algorithm has time/space complexities comparable to those of the state-of-the-art algorithm for typical PMP computation under the normalized l₂ distance (useful for shape-based similarity search). We validate its efficiency and effectiveness through extensive numerical experiments and a real-world anomaly detection application.
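
The double-ended queue mentioned here is presumably the classic monotonic-deque device for sliding-window extrema, the natural O(n) building block for window-by-window l∞ computations; a minimal sketch of that primitive (LINKUMP's interpolation step is not reproduced):

```python
from collections import deque

def sliding_max(x, w):
    # Indices in q keep their values in decreasing order, so q[0] is always
    # the maximum of the current window; each index enters and leaves the
    # deque once, giving O(n) overall.
    q, out = deque(), []
    for i, v in enumerate(x):
        while q and x[q[-1]] <= v:       # drop dominated indices
            q.pop()
        q.append(i)
        if q[0] <= i - w:                # front fell out of the window
            q.popleft()
        if i >= w - 1:
            out.append(x[q[0]])
    return out

print(sliding_max([3, 1, 4, 1, 5, 9, 2, 6], 3))   # [4, 4, 5, 9, 9, 9]
```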

Keywords: pan matrix profile, unnormalized euclidean distance, double-ended queue, discord discovery, anomaly detection

Procedia PDF Downloads 213
24416 Symbolic Computation for the Multi-Soliton Solutions of a Class of Fifth-Order Evolution Equations

Authors: Rafat Alshorman, Fadi Awawdeh

Abstract:

By employing a simplified bilinear method, a class of generalized fifth-order KdV (gfKdV) equations, which arise in nonlinear lattices, plasma physics, and ocean dynamics, is investigated. With the aid of symbolic computation, both solitary-wave solutions and multiple-soliton solutions are obtained. These new exact solutions extend previous results and help explain the properties of nonlinear solitary waves in many physical models of shallow water. A parametric analysis is carried out to illustrate how the soliton amplitude, width, and velocity are affected by the coefficient parameters of the equation.
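
Symbolic verification of soliton ansätze is routine in a computer algebra system. The sketch below checks the classical one-soliton solution of the standard KdV equation with SymPy (the paper's gfKdV class and its multi-soliton solutions are not reproduced):

```python
import sympy as sp

x, t, c = sp.symbols("x t c", positive=True)
u = c / 2 * sp.sech(sp.sqrt(c) / 2 * (x - c * t)) ** 2   # one-soliton ansatz

# KdV: u_t + 6 u u_x + u_xxx = 0; rewriting sech via exponentials lets
# simplify() cancel the residual exactly.
kdv = sp.diff(u, t) + 6 * u * sp.diff(u, x) + sp.diff(u, x, 3)
print(sp.simplify(kdv.rewrite(sp.exp)))                  # -> 0
```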

Keywords: multiple soliton solutions, fifth-order evolution equations, Cole-Hopf transformation, Hirota bilinear method

Procedia PDF Downloads 293
24415 A Dynamic Ensemble Learning Approach for Online Anomaly Detection in Alibaba Datacenters

Authors: Wanyi Zhu, Xia Ming, Huafeng Wang, Junda Chen, Lu Liu, Jiangwei Jiang, Guohua Liu

Abstract:

Anomaly detection is a first and imperative step in responding to unexpected problems and in assuring high performance and security in large data center management. This paper presents an online anomaly detection system built on an innovative combination of ensemble machine learning and adaptive differentiation algorithms, applied to performance data collected by a continuous monitoring system for multi-tier web applications running in Alibaba data centers. We evaluate the effectiveness and efficiency of the algorithm on production traffic data and compare it with traditional anomaly detection approaches such as static thresholds and other deviation-based detection techniques. The experimental results show that our algorithm correctly identifies unexpected performance variances of any running application with an acceptable false positive rate. The approach has already been deployed in real-time production environments to enhance the efficiency and stability of daily data center operations.

Keywords: Alibaba data centers, anomaly detection, big data computation, dynamic ensemble learning

Procedia PDF Downloads 169
24414 HPPDFIM-HD: Transaction Distortion and Connected Perturbation Approach for Hierarchical Privacy Preserving Distributed Frequent Itemset Mining over Horizontally-Partitioned Dataset

Authors: Fuad Ali Mohammed Al-Yarimi

Abstract:

Many algorithms have been proposed for privacy preservation in data mining. These protocols follow two main approaches: the perturbation approach, which perturbs the valuable information, and the cryptographic approach. The perturbation approach is much more efficient but reduces accuracy, while the cryptographic approach can provide solutions with perfect accuracy at the cost of being much slower and requiring considerable computation and communication overhead. In this paper, a new scalable protocol is proposed that combines the advantages of perturbation and distortion with the cryptographic approach to perform privacy-preserving distributed frequent itemset mining on horizontally partitioned data. Both the privacy and performance characteristics of the proposed protocol are studied empirically.

Keywords: anonymity data, data mining, distributed frequent itemset mining, Gaussian perturbation, perturbation approach, privacy preserving data mining

Procedia PDF Downloads 476
24413 An Interpretable Data-Driven Approach for the Stratification of the Cardiorespiratory Fitness

Authors: D. Mendes, J. Henriques, P. Carvalho, T. Rocha, S. Paredes, R. Cabiddu, R. Trimer, R. Mendes, A. Borghi-Silva, L. Kaminsky, E. Ashley, R. Arena, J. Myers

Abstract:

The exploration of clinically relevant predictive models continues to be an important pursuit. Cardiorespiratory fitness (CRF) carries vital clinical information, so its accurate prediction is of high importance. The aim of the current study was therefore to develop a data-driven model, based on computational intelligence techniques and in particular clustering approaches, to predict CRF. Two prediction models were implemented and compared: 1) the traditional Wasserman/Hansen equations; and 2) an interpretable clustering approach. Data for this analysis came from the 'FRIEND - Fitness Registry and the Importance of Exercise: The National Data Base'; the present study utilized a subset of 10,690 apparently healthy individuals. The accuracy of the models was assessed through the computation of sensitivity, specificity, and geometric mean values. The results show the superiority of the clustering approach in the accurate estimation of CRF (i.e., maximal oxygen consumption).
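
For reference, the three reported accuracy figures reduce to a few lines given a binary confusion matrix (the counts below are made up for illustration):

```python
import math

def stratification_metrics(tp, fn, tn, fp):
    # Sensitivity, specificity, and their geometric mean, the summary
    # score used to compare the two CRF prediction models.
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return sens, spec, math.sqrt(sens * spec)

print(stratification_metrics(tp=420, fn=80, tn=350, fp=150))
```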

Keywords: cardiorespiratory fitness, data-driven models, knowledge extraction, machine learning

Procedia PDF Downloads 256
24412 An Online Adaptive Thresholding Method to Classify Google Trends Data Anomalies for Investor Sentiment Analysis

Authors: Duygu Dere, Mert Ergeneci, Kaan Gokcesu

Abstract:

Google Trends data has gained increasing popularity in applications of behavioral finance, decision science, and risk management. Because of Google's wide range of use, Trends statistics provide significant information about investor sentiment and intention, which can serve as decisive factors in corporate and risk management. However, an anomaly (a significant increase or decrease) in a certain query cannot be detected by state-of-the-art computational applications because of the random baseline noise of the Trends data, which is modelled as additive white Gaussian noise (AWGN). Since the baseline noise power changes gradually over time, an adaptive thresholding method is required to track and learn the baseline noise for correct classification. To this end, we introduce an online method to classify meaningful deviations in Google Trends data. Through extensive experiments, we demonstrate that our method successfully classifies various anomalies across many different datasets.
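
A minimal online sketch of the adaptive-threshold idea; the exponentially weighted tracking of the baseline mean and power, and the 3-sigma decision band, are assumptions for illustration rather than the paper's exact update rules:

```python
import math

class AdaptiveThreshold:
    def __init__(self, alpha=0.05, k=3.0):
        self.alpha, self.k = alpha, k
        self.mu, self.var = 0.0, 1.0

    def update(self, x):
        # Flag x if it leaves the k-sigma band, then let only baseline
        # samples update the tracked mean and variance.
        is_anomaly = abs(x - self.mu) > self.k * math.sqrt(self.var)
        if not is_anomaly:
            d = x - self.mu
            self.mu += self.alpha * d
            self.var = (1 - self.alpha) * (self.var + self.alpha * d * d)
        return is_anomaly

detector = AdaptiveThreshold()
stream = [0.1, -0.2, 0.3, 0.0, 8.0, 0.2, -0.1]   # spike at index 4
print([detector.update(x) for x in stream])
```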

Keywords: adaptive data processing, behavioral finance, convex optimization, online learning, soft minimum thresholding

Procedia PDF Downloads 134
24411 Functional and Efficient Query Interpreters: Principle, Application and Performances’ Comparison

Authors: Laurent Thiry, Michel Hassenforder

Abstract:

This paper presents a general approach to implementing efficient query interpreters in a functional programming language. Most of the standard tools currently available use an imperative and/or object-oriented language for the implementation (e.g., Java for Jena-Fuseki), but other paradigms are possible, potentially with better performance. The paper first explains how to model data structures and queries from a functional point of view. It then proposes a general methodology for evaluating performance (i.e., the number of computation steps needed to answer a query) and explains how to integrate optimization techniques (short-cut fusion and, more importantly, data transformations). Finally, it compares the proposed functional server with a standard tool (Fuseki), demonstrating that the former can be two to ten times faster at answering queries.
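
The flavour of short-cut fusion carries over even outside Haskell: composing query operators as lazy generators walks the data once instead of materializing an intermediate result set per stage. A toy Python sketch with an invented three-triple store (illustration only, not the paper's implementation):

```python
triples = [("s1", "type", "Person"), ("s1", "age", 42), ("s2", "type", "City")]

def match(ts, pred):
    # One query operator: keep triples with a given predicate (lazy).
    return (t for t in ts if t[1] == pred)

def project(ts, i):
    # Another operator: keep a single column (lazy as well).
    return (t[i] for t in ts)

# The fused pipeline traverses `triples` exactly once.
print(list(project(match(triples, "type"), 0)))   # ['s1', 's2']
```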

Keywords: data transformation, functional programming, information server, optimization

Procedia PDF Downloads 126
24410 Brain Age Prediction Based on Brain Magnetic Resonance Imaging by 3D Convolutional Neural Network

Authors: Leila Keshavarz Afshar, Hedieh Sajedi

Abstract:

Estimating biological brain age from MR images has received much attention in recent years owing to its importance for the early diagnosis of diseases such as Alzheimer's. In this paper, we use a 3D convolutional neural network (CNN) to estimate the biological age of the brain. The 3D-CNN model is trained on normalized MRI data. In addition, to reduce computation while preserving overall performance, a subset of informative slices is selected for age estimation. With this method, biological age was estimated from the selected normalized data with a mean absolute error (MAE) of 4.82 years.
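
A minimal 3D-CNN regressor sketch in PyTorch; the layer sizes and slice count are assumptions for illustration, not the paper's architecture:

```python
import torch
import torch.nn as nn

class BrainAge3DCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(1, 8, 3, padding=1), nn.ReLU(), nn.MaxPool3d(2),
            nn.Conv3d(8, 16, 3, padding=1), nn.ReLU(), nn.MaxPool3d(2),
            nn.Conv3d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1),            # global pooling over the volume
        )
        self.head = nn.Linear(32, 1)            # predicted age in years

    def forward(self, x):                       # x: (batch, 1, D, H, W)
        return self.head(self.features(x).flatten(1))

model = BrainAge3DCNN()
volume = torch.randn(2, 1, 16, 64, 64)          # e.g. 16 selected slices per subject
print(model(volume).shape)                      # torch.Size([2, 1])
```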

Keywords: brain age estimation, biological age, 3D-CNN, deep learning, T1-weighted image, SPM, preprocessing, MRI, Canny, gray matter

Procedia PDF Downloads 117