Search results for: big data computation
24581 Parallel Evaluation of Sommerfeld Integrals for Multilayer Dyadic Green's Function
Authors: Duygu Kan, Mehmet Cayoren
Abstract:
Sommerfeld-integrals (SIs) are commonly encountered in electromagnetics problems involving analysis of antennas and scatterers embedded in planar multilayered media. Generally speaking, the analytical solution of SIs is unavailable, and it is well known that numerical evaluation of SIs is very time consuming and computationally expensive due to the highly oscillating and slowly decaying nature of the integrands. Therefore, fast computation of SIs has a paramount importance. In this paper, a parallel code has been developed to speed up the computation of SI in the framework of calculation of dyadic Green’s function in multilayered media. OpenMP shared memory approach is used to parallelize the SI algorithm and resulted in significant time savings. Moreover accelerating the computation of dyadic Green’s function is discussed based on the parallel SI algorithm developed.Keywords: Sommerfeld-integrals, multilayer dyadic Green’s function, OpenMP, shared memory parallel programming
Procedia PDF Downloads 21724580 Exploiting Non-Uniform Utility of Computing: A Case Study
Authors: Arnab Sarkar, Michael Huang, Chuang Ren, Jun Li
Abstract:
The increasing importance of computing in modern society has brought substantial growth in the demand for more computational power. In some problem domains such as scientific simulations, available computational power still sets a limit on what can be practically explored in computation. For many types of code, there is non-uniformity in the utility of computation. That is not every piece of computation contributes equally to the quality of the result. If this non-uniformity is understood well and exploited effectively, we can much more effectively utilize available computing power. In this paper, we discuss a case study of exploring such non-uniformity in a particle-in-cell simulation platform. We find both the existence of significant non-uniformity and that it is generally straightforward to exploit it. We show the potential of order-of-magnitude effective performance gain while keeping the comparable quality of output. We also discuss some challenges in both the practical application of the idea and evaluation of its impact.Keywords: approximate computing, landau damping, non uniform utility computing, particle-in-cell
Procedia PDF Downloads 23224579 Discussion on Big Data and One of Its Early Training Application
Authors: Fulya Gokalp Yavuz, Mark Daniel Ward
Abstract:
This study focuses on a contemporary and inevitable topic of Data Science and its exemplary application for early career building: Big Data and Leaving Learning Community (LLC). ‘Academia’ and ‘Industry’ have a common sense on the importance of Big Data. However, both of them are in a threat of missing the training on this interdisciplinary area. Some traditional teaching doctrines are far away being effective on Data Science. Practitioners needs some intuition and real-life examples how to apply new methods to data in size of terabytes. We simply explain the scope of Data Science training and exemplified its early stage application with LLC, which is a National Science Foundation (NSF) founded project under the supervision of Prof. Ward since 2014. Essentially, we aim to give some intuition for professors, researchers and practitioners to combine data science tools for comprehensive real-life examples with the guides of mentees’ feedback. As a result of discussing mentoring methods and computational challenges of Big Data, we intend to underline its potential with some more realization.Keywords: Big Data, computation, mentoring, training
Procedia PDF Downloads 33124578 A CORDIC Based Design Technique for Efficient Computation of DCT
Authors: Deboraj Muchahary, Amlan Deep Borah Abir J. Mondal, Alak Majumder
Abstract:
A discrete cosine transform (DCT) is described and a technique to compute it using fast Fourier transform (FFT) is developed. In this work, DCT of a finite length sequence is obtained by incorporating CORDIC methodology in radix-2 FFT algorithm. The proposed methodology is simple to comprehend and maintains a regular structure, thereby reducing computational complexity. DCTs are used extensively in the area of digital processing for the purpose of pattern recognition. So the efficient computation of DCT maintaining a transparent design flow is highly solicited.Keywords: DCT, DFT, CORDIC, FFT
Procedia PDF Downloads 44924577 Numerical Computation of Specific Absorption Rate and Induced Current for Workers Exposed to Static Magnetic Fields of MRI Scanners
Authors: Sherine Farrag
Abstract:
Currently-used MRI scanners in Cairo City possess static magnetic field (SMF) that varies from 0.25 up to 3T. More than half of them possess SMF of 1.5T. The SMF of the magnet determine the diagnostic power of a scanner, but not worker's exposure profile. This research paper presents an approach for numerical computation of induced electric fields and SAR values by estimation of fringe static magnetic fields. Iso-gauss line of MR was mapped and a polynomial function of the 7th degree was generated and tested. Induced current field due to worker motion in the SMF and SAR values for organs and tissues have been calculated. Results illustrate that the computation tool used permits quick accurate MRI iso-gauss mapping and calculation of SAR values which can then be used for assessment of occupational exposure profile of MRI operators.Keywords: MRI occupational exposure, MRI safety, induced current density, specific absorption rate, static magnetic fields
Procedia PDF Downloads 40624576 Computation of Natural Logarithm Using Abstract Chemical Reaction Networks
Authors: Iuliia Zarubiieva, Joyun Tseng, Vishwesh Kulkarni
Abstract:
Recent researches has focused on nucleic acids as a substrate for designing biomolecular circuits for in situ monitoring and control. A common approach is to express them by a set of idealised abstract chemical reaction networks (ACRNs). Here, we present new results on how abstract chemical reactions, viz., catalysis, annihilation and degradation, can be used to implement circuit that accurately computes logarithm function using the method of Arithmetic-Geometric Mean (AGM), which has not been previously used in conjunction with ACRNs.Keywords: chemical reaction networks, ratio computation, stability, robustness
Procedia PDF Downloads 13824575 A Mutually Exclusive Task Generation Method Based on Data Augmentation
Authors: Haojie Wang, Xun Li, Rui Yin
Abstract:
In order to solve the memorization overfitting in the meta-learning MAML algorithm, a method of generating mutually exclusive tasks based on data augmentation is proposed. This method generates a mutex task by corresponding one feature of the data to multiple labels, so that the generated mutex task is inconsistent with the data distribution in the initial dataset. Because generating mutex tasks for all data will produce a large number of invalid data and, in the worst case, lead to exponential growth of computation, this paper also proposes a key data extraction method, that only extracts part of the data to generate the mutex task. The experiments show that the method of generating mutually exclusive tasks can effectively solve the memorization overfitting in the meta-learning MAML algorithm.Keywords: data augmentation, mutex task generation, meta-learning, text classification.
Procedia PDF Downloads 7024574 A Mutually Exclusive Task Generation Method Based on Data Augmentation
Authors: Haojie Wang, Xun Li, Rui Yin
Abstract:
In order to solve the memorization overfitting in the model-agnostic meta-learning MAML algorithm, a method of generating mutually exclusive tasks based on data augmentation is proposed. This method generates a mutex task by corresponding one feature of the data to multiple labels so that the generated mutex task is inconsistent with the data distribution in the initial dataset. Because generating mutex tasks for all data will produce a large number of invalid data and, in the worst case, lead to an exponential growth of computation, this paper also proposes a key data extraction method that only extract part of the data to generate the mutex task. The experiments show that the method of generating mutually exclusive tasks can effectively solve the memorization overfitting in the meta-learning MAML algorithm.Keywords: mutex task generation, data augmentation, meta-learning, text classification.
Procedia PDF Downloads 10824573 Operator Optimization Based on Hardware Architecture Alignment Requirements
Authors: Qingqing Gai, Junxing Shen, Yu Luo
Abstract:
Due to the hardware architecture characteristics, some operators tend to acquire better performance if the input/output tensor dimensions are aligned to a certain minimum granularity, such as convolution and deconvolution commonly used in deep learning. Furthermore, if the requirements are not met, the general strategy is to pad with 0 to satisfy the requirements, potentially leading to the under-utilization of the hardware resources. Therefore, for the convolution and deconvolution whose input and output channels do not meet the minimum granularity alignment, we propose to transfer the W-dimensional data to the C-dimension for computation (W2C) to enable the C-dimension to meet the hardware requirements. This scheme also reduces the number of computations in the W-dimension. Although this scheme substantially increases computation, the operator’s speed can improve significantly. It achieves remarkable speedups on multiple hardware accelerators, including Nvidia Tensor cores, Qualcomm digital signal processors (DSPs), and Huawei neural processing units (NPUs). All you need to do is modify the network structure and rearrange the operator weights offline without retraining. At the same time, for some operators, such as the Reducemax, we observe that transferring the Cdimensional data to the W-dimension(C2W) and replacing the Reducemax with the Maxpool can accomplish acceleration under certain circumstances.Keywords: convolution, deconvolution, W2C, C2W, alignment, hardware accelerator
Procedia PDF Downloads 7624572 Analyzing the Factors that Cause Parallel Performance Degradation in Parallel Graph-Based Computations Using Graph500
Authors: Mustafa Elfituri, Jonathan Cook
Abstract:
Recently, graph-based computations have become more important in large-scale scientific computing as they can provide a methodology to model many types of relations between independent objects. They are being actively used in fields as varied as biology, social networks, cybersecurity, and computer networks. At the same time, graph problems have some properties such as irregularity and poor locality that make their performance different than regular applications performance. Therefore, parallelizing graph algorithms is a hard and challenging task. Initial evidence is that standard computer architectures do not perform very well on graph algorithms. Little is known exactly what causes this. The Graph500 benchmark is a representative application for parallel graph-based computations, which have highly irregular data access and are driven more by traversing connected data than by computation. In this paper, we present results from analyzing the performance of various example implementations of Graph500, including a shared memory (OpenMP) version, a distributed (MPI) version, and a hybrid version. We measured and analyzed all the factors that affect its performance in order to identify possible changes that would improve its performance. Results are discussed in relation to what factors contribute to performance degradation.Keywords: graph computation, graph500 benchmark, parallel architectures, parallel programming, workload characterization.
Procedia PDF Downloads 11624571 Gaussian Mixture Model Based Identification of Arterial Wall Movement for Computation of Distension Waveform
Authors: Ravindra B. Patil, P. Krishnamoorthy, Shriram Sethuraman
Abstract:
This work proposes a novel Gaussian Mixture Model (GMM) based approach for accurate tracking of the arterial wall and subsequent computation of the distension waveform using Radio Frequency (RF) ultrasound signal. The approach was evaluated on ultrasound RF data acquired using a prototype ultrasound system from an artery mimicking flow phantom. The effectiveness of the proposed algorithm is demonstrated by comparing with existing wall tracking algorithms. The experimental results show that the proposed method provides 20% reduction in the error margin compared to the existing approaches in tracking the arterial wall movement. This approach coupled with ultrasound system can be used to estimate the arterial compliance parameters required for screening of cardiovascular related disorders.Keywords: distension waveform, Gaussian Mixture Model, RF ultrasound, arterial wall movement
Procedia PDF Downloads 48024570 GPU Based Real-Time Floating Object Detection System
Authors: Jie Yang, Jian-Min Meng
Abstract:
A GPU-based floating object detection scheme is presented in this paper which is designed for floating mine detection tasks. This system uses contrast and motion information to eliminate as many false positives as possible while avoiding false negatives. The GPU computation platform is deployed to allow detecting objects in real-time. From the experimental results, it is shown that with certain configuration, the GPU-based scheme can speed up the computation up to one thousand times compared to the CPU-based scheme.Keywords: object detection, GPU, motion estimation, parallel processing
Procedia PDF Downloads 45024569 Crow Search Algorithm-Based Task Offloading Strategies for Fog Computing Architectures
Authors: Aniket Ganvir, Ritarani Sahu, Suchismita Chinara
Abstract:
The rapid digitization of various aspects of life is leading to the creation of smart IoT ecosystems, where interconnected devices generate significant amounts of valuable data. However, these IoT devices face constraints such as limited computational resources and bandwidth. Cloud computing emerges as a solution by offering ample resources for offloading tasks efficiently despite introducing latency issues, especially for time-sensitive applications like fog computing. Fog computing (FC) addresses latency concerns by bringing computation and storage closer to the network edge, minimizing data travel distance, and enhancing efficiency. Offloading tasks to fog nodes or the cloud can conserve energy and extend IoT device lifespan. The offloading process is intricate, with tasks categorized as full or partial, and its optimization presents an NP-hard problem. Traditional greedy search methods struggle to address the complexity of task offloading efficiently. To overcome this, the efficient crow search algorithm (ECSA) has been proposed as a meta-heuristic optimization algorithm. ECSA aims to effectively optimize computation offloading, providing solutions to this challenging problem.Keywords: IoT, fog computing, task offloading, efficient crow search algorithm
Procedia PDF Downloads 2124568 A Time-Reducible Approach to Compute Determinant |I-X|
Authors: Wang Xingbo
Abstract:
Computation of determinant in the form |I-X| is primary and fundamental because it can help to compute many other determinants. This article puts forward a time-reducible approach to compute determinant |I-X|. The approach is derived from the Newton’s identity and its time complexity is no more than that to compute the eigenvalues of the square matrix X. Mathematical deductions and numerical example are presented in detail for the approach. By comparison with classical approaches the new approach is proved to be superior to the classical ones and it can naturally reduce the computational time with the improvement of efficiency to compute eigenvalues of the square matrix.Keywords: algorithm, determinant, computation, eigenvalue, time complexity
Procedia PDF Downloads 39424567 Computation of Radiotherapy Treatment Plans Based on CT to ED Conversion Curves
Authors: B. Petrović, L. Rutonjski, M. Baucal, M. Teodorović, O. Čudić, B. Basarić
Abstract:
Radiotherapy treatment planning computers use CT data of the patient. For the computation of a treatment plan, treatment planning system must have an information on electron densities of tissues scanned by CT. This information is given by the conversion curve CT (CT number) to ED (electron density), or simply calibration curve. Every treatment planning system (TPS) has built in default CT to ED conversion curves, for the CTs of different manufacturers. However, it is always recommended to verify the CT to ED conversion curve before actual clinical use. Objective of this study was to check how the default curve already provided matches the curve actually measured on a specific CT, and how much it influences the calculation of a treatment planning computer. The examined CT scanners were from the same manufacturer, but four different scanners from three generations. The measurements of all calibration curves were done with the dedicated phantom CIRS 062M Electron Density Phantom. The phantom was scanned, and according to real HU values read at the CT console computer, CT to ED conversion curves were generated for different materials, for same tube voltage 140 kV. Another phantom, CIRS Thorax 002 LFC which represents an average human torso in proportion, density and two-dimensional structure, was used for verification. The treatment planning was done on CT slices of scanned CIRS LFC 002 phantom, for selected cases. Interest points were set in the lungs, and in the spinal cord, and doses recorded in TPS. The overall calculated treatment times for four scanners and default scanner did not differ more than 0.8%. Overall interest point dose in bone differed max 0.6% while for single fields was maximum 2.7% (lateral field). Overall interest point dose in lungs differed max 1.1% while for single fields was maximum 2.6% (lateral field). It is known that user should verify the CT to ED conversion curve, but often, developing countries are facing lack of QA equipment, and often use default data provided. We have concluded that the CT to ED curves obtained differ in certain points of a curve, generally in the region of higher densities. This influences the treatment planning result which is not significant, but definitely does make difference in the calculated dose.Keywords: Computation of treatment plan, conversion curve, radiotherapy, electron density
Procedia PDF Downloads 45424566 Study and Analysis of the Factors Affecting Road Safety Using Decision Tree Algorithms
Authors: Naina Mahajan, Bikram Pal Kaur
Abstract:
The purpose of traffic accident analysis is to find the possible causes of an accident. Road accidents cannot be totally prevented but by suitable traffic engineering and management the accident rate can be reduced to a certain extent. This paper discusses the classification techniques C4.5 and ID3 using the WEKA Data mining tool. These techniques use on the NH (National highway) dataset. With the C4.5 and ID3 technique it gives best results and high accuracy with less computation time and error rate.Keywords: C4.5, ID3, NH(National highway), WEKA data mining tool
Procedia PDF Downloads 30824565 A Unified Webcam Proctoring Solution on Edge
Authors: Saw Thiha, Jay Rajasekera
Abstract:
A boom in video conferencing generated millions of hours of video data daily to be analyzed. However, such enormous data pose certain scalability issues to be analyzed efficiently, let alone do it in real-time, as online conferences can involve hundreds of people and can last for hours. This paper proposes an efficient online proctoring solution that can analyze the online conferences real-time on edge devices such as Android, iOS, and desktops. Since the computation can be done upfront on the devices where online conferences take place, it can scale well without requiring intensive resources such as GPU servers and complex cloud infrastructure. According to the linear models, face orientation does indeed impact the perceived eye openness. Also, the proposed z score facial landmark standardization was proven to be functional in detecting face orientation and contributed to classifying eye blinks with single eyelid distance computation while achieving a better f1 score and accuracy than the Eye Aspect Ratio (EAR) threshold method. Last but not least, the authors implemented the solution natively in the MediaPipe framework and open-sourced it along with the reproducible experimental results on GitHub. The solution provides face orientation, eye blink, facial activity, and translation detections out of the box and is highly customizable and extensible.Keywords: android, desktop, edge computing, blink, face orientation, facial activity and translation, MediaPipe, open source, real-time, video conference, web, iOS, Z score facial landmark standardization
Procedia PDF Downloads 7524564 Data Hiding by Vector Quantization in Color Image
Authors: Yung Gi Wu
Abstract:
With the growing of computer and network, digital data can be spread to anywhere in the world quickly. In addition, digital data can also be copied or tampered easily so that the security issue becomes an important topic in the protection of digital data. Digital watermark is a method to protect the ownership of digital data. Embedding the watermark will influence the quality certainly. In this paper, Vector Quantization (VQ) is used to embed the watermark into the image to fulfill the goal of data hiding. This kind of watermarking is invisible which means that the users will not conscious the existing of embedded watermark even though the embedded image has tiny difference compared to the original image. Meanwhile, VQ needs a lot of computation burden so that we adopt a fast VQ encoding scheme by partial distortion searching (PDS) and mean approximation scheme to speed up the data hiding process. The watermarks we hide to the image could be gray, bi-level and color images. Texts are also can be regarded as watermark to embed. In order to test the robustness of the system, we adopt Photoshop to fulfill sharpen, cropping and altering to check if the extracted watermark is still recognizable. Experimental results demonstrate that the proposed system can resist the above three kinds of tampering in general cases.Keywords: data hiding, vector quantization, watermark, color image
Procedia PDF Downloads 33824563 Speed up Vector Median Filtering by Quasi Euclidean Norm
Authors: Vinai K. Singh
Abstract:
For reducing impulsive noise without degrading image contours, median filtering is a powerful tool. In multiband images as for example colour images or vector fields obtained by optic flow computation, a vector median filter can be used. Vector median filters are defined on the basis of a suitable distance, the best performing distance being the Euclidean. Euclidean distance is evaluated by using the Euclidean norms which is quite demanding from the point of view of computation given that a square root is required. In this paper an optimal piece-wise linear approximation of the Euclidean norm is presented which is applied to vector median filtering.Keywords: euclidean norm, quasi euclidean norm, vector median filtering, applied mathematics
Procedia PDF Downloads 44024562 Cloud-Based Mobile-to-Mobile Computation Offloading
Authors: Ebrahim Alrashed, Yousef Rafique
Abstract:
Mobile devices have drastically changed the way we do things on the move. They are being extremely relied on to perform tasks that are analogous to desktop computer capability. There has been a rapid increase of computational power on these devices; however, battery technology is still the bottleneck of evolution. The primary modern approach day approach to tackle this issue is offloading computation to the cloud, proving to be latency expensive and requiring high network bandwidth. In this paper, we explore efforts to perform barter-based mobile-to-mobile offloading. We present define a protocol and present an architecture to facilitate the development of such a system. We further highlight the deployment and security challenges.Keywords: computational offloading, power conservation, cloud, sandboxing
Procedia PDF Downloads 36724561 Collision Detection Algorithm Based on Data Parallelism
Authors: Zhen Peng, Baifeng Wu
Abstract:
Modern computing technology enters the era of parallel computing with the trend of sustainable and scalable parallelism. Single Instruction Multiple Data (SIMD) is an important way to go along with the trend. It is able to gather more and more computing ability by increasing the number of processor cores without the need of modifying the program. Meanwhile, in the field of scientific computing and engineering design, many computation intensive applications are facing the challenge of increasingly large amount of data. Data parallel computing will be an important way to further improve the performance of these applications. In this paper, we take the accurate collision detection in building information modeling as an example. We demonstrate a model for constructing a data parallel algorithm. According to the model, a complex object is decomposed into the sets of simple objects; collision detection among complex objects is converted into those among simple objects. The resulting algorithm is a typical SIMD algorithm, and its advantages in parallelism and scalability is unparalleled in respect to the traditional algorithms.Keywords: data parallelism, collision detection, single instruction multiple data, building information modeling, continuous scalability
Procedia PDF Downloads 26124560 Algorithms for Fast Computation of Pan Matrix Profiles of Time Series Under Unnormalized Euclidean Distances
Authors: Jing Zhang, Daniel Nikovski
Abstract:
We propose an approximation algorithm called LINKUMP to compute the Pan Matrix Profile (PMP) under the unnormalized l∞ distance (useful for value-based similarity search) using double-ended queue and linear interpolation. The algorithm has comparable time/space complexities as the state-of-the-art algorithm for typical PMP computation under the normalized l₂ distance (useful for shape-based similarity search). We validate its efficiency and effectiveness through extensive numerical experiments and a real-world anomaly detection application.Keywords: pan matrix profile, unnormalized euclidean distance, double-ended queue, discord discovery, anomaly detection
Procedia PDF Downloads 21824559 Problems of Boolean Reasoning Based Biclustering Parallelization
Authors: Marcin Michalak
Abstract:
Biclustering is the way of two-dimensional data analysis. For several years it became possible to express such issue in terms of Boolean reasoning, for processing continuous, discrete and binary data. The mathematical backgrounds of such approach — proved ability of induction of exact and inclusion–maximal biclusters fulfilling assumed criteria — are strong advantages of the method. Unfortunately, the core of the method has quite high computational complexity. In the paper the basics of Boolean reasoning approach for biclustering are presented. In such context the problems of computation parallelization are risen.Keywords: Boolean reasoning, biclustering, parallelization, prime implicant
Procedia PDF Downloads 10124558 Symbolic Computation for the Multi-Soliton Solutions of a Class of Fifth-Order Evolution Equations
Authors: Rafat Alshorman, Fadi Awawdeh
Abstract:
By employing a simplified bilinear method, a class of generalized fifth-order KdV (gfKdV) equations which arise in nonlinear lattice, plasma physics and ocean dynamics are investigated. With the aid of symbolic computation, both solitary wave solutions and multiple-soliton solutions are obtained. These new exact solutions will extend previous results and help us explain the properties of nonlinear solitary waves in many physical models in shallow water. Parametric analysis is carried out in order to illustrate that the soliton amplitude, width and velocity are affected by the coefficient parameters in the equation.Keywords: multiple soliton solutions, fifth-order evolution equations, Cole-Hopf transformation, Hirota bilinear method
Procedia PDF Downloads 29624557 A Dynamic Ensemble Learning Approach for Online Anomaly Detection in Alibaba Datacenters
Authors: Wanyi Zhu, Xia Ming, Huafeng Wang, Junda Chen, Lu Liu, Jiangwei Jiang, Guohua Liu
Abstract:
Anomaly detection is a first and imperative step needed to respond to unexpected problems and to assure high performance and security in large data center management. This paper presents an online anomaly detection system through an innovative approach of ensemble machine learning and adaptive differentiation algorithms, and applies them to performance data collected from a continuous monitoring system for multi-tier web applications running in Alibaba data centers. We evaluate the effectiveness and efficiency of this algorithm with production traffic data and compare with the traditional anomaly detection approaches such as a static threshold and other deviation-based detection techniques. The experiment results show that our algorithm correctly identifies the unexpected performance variances of any running application, with an acceptable false positive rate. This proposed approach has already been deployed in real-time production environments to enhance the efficiency and stability in daily data center operations.Keywords: Alibaba data centers, anomaly detection, big data computation, dynamic ensemble learning
Procedia PDF Downloads 17324556 HPPDFIM-HD: Transaction Distortion and Connected Perturbation Approach for Hierarchical Privacy Preserving Distributed Frequent Itemset Mining over Horizontally-Partitioned Dataset
Authors: Fuad Ali Mohammed Al-Yarimi
Abstract:
Many algorithms have been proposed to provide privacy preserving in data mining. These protocols are based on two main approaches named as: the perturbation approach and the Cryptographic approach. The first one is based on perturbation of the valuable information while the second one uses cryptographic techniques. The perturbation approach is much more efficient with reduced accuracy while the cryptographic approach can provide solutions with perfect accuracy. However, the cryptographic approach is a much slower method and requires considerable computation and communication overhead. In this paper, a new scalable protocol is proposed which combines the advantages of the perturbation and distortion along with cryptographic approach to perform privacy preserving in distributed frequent itemset mining on horizontally distributed data. Both the privacy and performance characteristics of the proposed protocol are studied empirically.Keywords: anonymity data, data mining, distributed frequent itemset mining, gaussian perturbation, perturbation approach, privacy preserving data mining
Procedia PDF Downloads 48424555 An Interpretable Data-Driven Approach for the Stratification of the Cardiorespiratory Fitness
Authors: D.Mendes, J. Henriques, P. Carvalho, T. Rocha, S. Paredes, R. Cabiddu, R. Trimer, R. Mendes, A. Borghi-Silva, L. Kaminsky, E. Ashley, R. Arena, J. Myers
Abstract:
The continued exploration of clinically relevant predictive models continues to be an important pursuit. Cardiorespiratory fitness (CRF) portends clinical vital information and as such its accurate prediction is of high importance. Therefore, the aim of the current study was to develop a data-driven model, based on computational intelligence techniques and, in particular, clustering approaches, to predict CRF. Two prediction models were implemented and compared: 1) the traditional Wasserman/Hansen Equations; and 2) an interpretable clustering approach. Data used for this analysis were from the 'FRIEND - Fitness Registry and the Importance of Exercise: The National Data Base'; in the present study a subset of 10690 apparently healthy individuals were utilized. The accuracy of the models was performed through the computation of sensitivity, specificity, and geometric mean values. The results show the superiority of the clustering approach in the accurate estimation of CRF (i.e., maximal oxygen consumption).Keywords: cardiorespiratory fitness, data-driven models, knowledge extraction, machine learning
Procedia PDF Downloads 26324554 An Online Adaptive Thresholding Method to Classify Google Trends Data Anomalies for Investor Sentiment Analysis
Authors: Duygu Dere, Mert Ergeneci, Kaan Gokcesu
Abstract:
Google Trends data has gained increasing popularity in the applications of behavioral finance, decision science and risk management. Because of Google’s wide range of use, the Trends statistics provide significant information about the investor sentiment and intention, which can be used as decisive factors for corporate and risk management fields. However, an anomaly, a significant increase or decrease, in a certain query cannot be detected by the state of the art applications of computation due to the random baseline noise of the Trends data, which is modelled as an Additive white Gaussian noise (AWGN). Since through time, the baseline noise power shows a gradual change an adaptive thresholding method is required to track and learn the baseline noise for a correct classification. To this end, we introduce an online method to classify meaningful deviations in Google Trends data. Through extensive experiments, we demonstrate that our method can successfully classify various anomalies for plenty of different data.Keywords: adaptive data processing, behavioral finance , convex optimization, online learning, soft minimum thresholding
Procedia PDF Downloads 13924553 Functional and Efficient Query Interpreters: Principle, Application and Performances’ Comparison
Authors: Laurent Thiry, Michel Hassenforder
Abstract:
This paper presents a general approach to implement efficient queries’ interpreters in a functional programming language. Indeed, most of the standard tools actually available use an imperative and/or object-oriented language for the implementation (e.g. Java for Jena-Fuseki) but other paradigms are possible with, maybe, better performances. To proceed, the paper first explains how to model data structures and queries in a functional point of view. Then, it proposes a general methodology to get performances (i.e. number of computation steps to answer a query) then it explains how to integrate some optimization techniques (short-cut fusion and, more important, data transformations). It then compares the functional server proposed to a standard tool (Fuseki) demonstrating that the first one can be twice to ten times faster to answer queries.Keywords: data transformation, functional programming, information server, optimization
Procedia PDF Downloads 13224552 Brain Age Prediction Based on Brain Magnetic Resonance Imaging by 3D Convolutional Neural Network
Authors: Leila Keshavarz Afshar, Hedieh Sajedi
Abstract:
Estimation of biological brain age from MR images is a topic that has been much addressed in recent years due to the importance it attaches to early diagnosis of diseases such as Alzheimer's. In this paper, we use a 3D Convolutional Neural Network (CNN) to provide a method for estimating the biological age of the brain. The 3D-CNN model is trained by MRI data that has been normalized. In addition, to reduce computation while saving overall performance, some effectual slices are selected for age estimation. By this method, the biological age of individuals using selected normalized data was estimated with Mean Absolute Error (MAE) of 4.82 years.Keywords: brain age estimation, biological age, 3D-CNN, deep learning, T1-weighted image, SPM, preprocessing, MRI, canny, gray matter
Procedia PDF Downloads 121