Search results for: factorization of computation
510 Stochastic Control of Decentralized Singularly Perturbed Systems
Authors: Walid S. Alfuhaid, Saud A. Alghamdi, John M. Watkins, M. Edwin Sawan
Abstract:
Designing a controller for stochastic decentralized interconnected large scale systems usually involves a high degree of complexity and computation ability. Noise, observability, and controllability of all system states, connectivity, and channel bandwidth are other constraints to design procedures for distributed large scale systems. The quasi-steady state model investigated in this paper is a reduced order model of the original system using singular perturbation techniques. This paper results in an optimal control synthesis to design an observer based feedback controller by standard stochastic control theory techniques using Linear Quadratic Gaussian (LQG) approach and Kalman filter design with less complexity and computation requirements. Numerical example is given at the end to demonstrate the efficiency of the proposed method.Keywords: decentralized, optimal control, output, singular perturb
Procedia PDF Downloads 370509 Overcoming 4-to-1 Decryption Failure of the Rabin Cryptosystem
Authors: Muhammad Rezal Kamel Ariffin, Muhammad Asyraf Asbullah
Abstract:
The square root modulo problem is a known primitive in designing an asymmetric cryptosystem. It was first attempted by Rabin. Decryption failure of the Rabin cryptosystem caused by the 4-to-1 decryption output is overcome efficiently in this work. The proposed scheme to overcome the decryption failure issue (known as the AAβ-cryptosystem) is constructed using a simple mathematical structure, it has low computational requirements and would enable communication devices with low computing power to deploy secure communication procedures efficiently.Keywords: Rabin cryptosystem, 4-to-1 decryption failure, square root modulo problem, integer factorization problem
Procedia PDF Downloads 475508 Gaussian Mixture Model Based Identification of Arterial Wall Movement for Computation of Distension Waveform
Authors: Ravindra B. Patil, P. Krishnamoorthy, Shriram Sethuraman
Abstract:
This work proposes a novel Gaussian Mixture Model (GMM) based approach for accurate tracking of the arterial wall and subsequent computation of the distension waveform using Radio Frequency (RF) ultrasound signal. The approach was evaluated on ultrasound RF data acquired using a prototype ultrasound system from an artery mimicking flow phantom. The effectiveness of the proposed algorithm is demonstrated by comparing with existing wall tracking algorithms. The experimental results show that the proposed method provides 20% reduction in the error margin compared to the existing approaches in tracking the arterial wall movement. This approach coupled with ultrasound system can be used to estimate the arterial compliance parameters required for screening of cardiovascular related disorders.Keywords: distension waveform, Gaussian Mixture Model, RF ultrasound, arterial wall movement
Procedia PDF Downloads 506507 Numerical Computation of Sturm-Liouville Problem with Robin Boundary Condition
Authors: Theddeus T. Akano, Omotayo A. Fakinlede
Abstract:
The modelling of physical phenomena, such as the earth’s free oscillations, the vibration of strings, the interaction of atomic particles, or the steady state flow in a bar give rise to Sturm-Liouville (SL) eigenvalue problems. The boundary applications of some systems like the convection-diffusion equation, electromagnetic and heat transfer problems requires the combination of Dirichlet and Neumann boundary conditions. Hence, the incorporation of Robin boundary condition in the analyses of Sturm-Liouville problem. This paper deals with the computation of the eigenvalues and eigenfunction of generalized Sturm-Liouville problems with Robin boundary condition using the finite element method. Numerical solutions of classical Sturm–Liouville problems are presented. The results show an agreement with the exact solution. High results precision is achieved with higher number of elements.Keywords: Sturm-Liouville problem, Robin boundary condition, finite element method, eigenvalue problems
Procedia PDF Downloads 362506 Efficient DNN Training on Heterogeneous Clusters with Pipeline Parallelism
Abstract:
Pipeline parallelism has been widely used to accelerate distributed deep learning to alleviate GPU memory bottlenecks and to ensure that models can be trained and deployed smoothly under limited graphics memory conditions. However, in highly heterogeneous distributed clusters, traditional model partitioning methods are not able to achieve load balancing. The overlap of communication and computation is also a big challenge. In this paper, HePipe is proposed, an efficient pipeline parallel training method for highly heterogeneous clusters. According to the characteristics of the neural network model pipeline training task, oriented to the 2-level heterogeneous cluster computing topology, a training method based on the 2-level stage division of neural network modeling and partitioning is designed to improve the parallelism. Additionally, a multi-forward 1F1B scheduling strategy is designed to accelerate the training time of each stage by executing the computation units in advance to maximize the overlap between the forward propagation communication and backward propagation computation. Finally, a dynamic recomputation strategy based on task memory requirement prediction is proposed to improve the fitness ratio of task and memory, which improves the throughput of the cluster and solves the memory shortfall problem caused by memory differences in heterogeneous clusters. The empirical results show that HePipe improves the training speed by 1.6×−2.2× over the existing asynchronous pipeline baselines.Keywords: pipeline parallelism, heterogeneous cluster, model training, 2-level stage partitioning
Procedia PDF Downloads 18505 Scheduling Algorithm Based on Load-Aware Queue Partitioning in Heterogeneous Multi-Core Systems
Authors: Hong Kai, Zhong Jun Jie, Chen Lin Qi, Wang Chen Guang
Abstract:
There are inefficient global scheduling parallelism and local scheduling parallelism prone to processor starvation in current scheduling algorithms. Regarding this issue, this paper proposed a load-aware queue partitioning scheduling strategy by first allocating the queues according to the number of processor cores, calculating the load factor to specify the load queue capacity, and it assigned the awaiting nodes to the appropriate perceptual queues through the precursor nodes and the communication computation overhead. At the same time, real-time computation of the load factor could effectively prevent the processor from being starved for a long time. Experimental comparison with two classical algorithms shows that there is a certain improvement in both performance metrics of scheduling length and task speedup ratio.Keywords: load-aware, scheduling algorithm, perceptual queue, heterogeneous multi-core
Procedia PDF Downloads 145504 Core Number Optimization Based Scheduler to Order/Mapp Simulink Application
Authors: Asma Rebaya, Imen Amari, Kaouther Gasmi, Salem Hasnaoui
Abstract:
Over these last years, the number of cores witnessed a spectacular increase in digital signal and general use processors. Concurrently, significant researches are done to get benefit from the high degree of parallelism. Indeed, these researches are focused to provide an efficient scheduling from hardware/software systems to multicores architecture. The scheduling process consists on statically choose one core to execute one task and to specify an execution order for the application tasks. In this paper, we describe an efficient scheduler that calculates the optimal number of cores required to schedule an application, gives a heuristic scheduling solution and evaluates its cost. Our proposal results are evaluated and compared with Preesm scheduler results and we prove that ours allows better scheduling in terms of latency, computation time and number of cores.Keywords: computation time, hardware/software system, latency, optimization, multi-cores platform, scheduling
Procedia PDF Downloads 283503 Number of Necessary Parameters for Parametrization of Stabilizing Controllers for two times two RHinf Systems
Authors: Kazuyoshi Mori
Abstract:
In this paper, we consider the number of parameters for the parametrization of stabilizing controllers for RHinf systems with size 2 × 2. Fortunately, any plant of this model can admit doubly coprime factorization. Thus we can use the Youla parameterization to parametrize the stabilizing contollers . However, Youla parameterization does not give itself the minimal number of parameters. This paper shows that the minimal number of parameters is four. As a result, we show that the Youla parametrization naturally gives the parameterization of stabilizing controllers with minimal numbers.Keywords: RHinfo, parameterization, number of parameters, multi-input, multi-output systems
Procedia PDF Downloads 407502 Model and Algorithm for Dynamic Wireless Electric Vehicle Charging Network Design
Authors: Trung Hieu Tran, Jesse O'Hanley, Russell Fowler
Abstract:
When in-wheel wireless charging technology for electric vehicles becomes mature, a need for such integrated charging stations network development is essential. In this paper, we thus investigate the optimisation problem of in-wheel wireless electric vehicle charging network design. A mixed-integer linear programming model is formulated to solve into optimality the problem. In addition, a meta-heuristic algorithm is proposed for efficiently solving large-sized instances within a reasonable computation time. A parallel computing strategy is integrated into the algorithm to speed up its computation time. Experimental results carried out on the benchmark instances show that our model and algorithm can find the optimal solutions and their potential for practical applications.Keywords: electric vehicle, wireless charging station, mathematical programming, meta-heuristic algorithm, parallel computing
Procedia PDF Downloads 79501 A Deterministic Large Deviation Model Based on Complex N-Body Systems
Authors: David C. Ni
Abstract:
In the previous efforts, we constructed N-Body Systems by an extended Blaschke product (EBP), which represents a non-temporal and nonlinear extension of Lorentz transformation. In this construction, we rely only on two parameters, nonlinear degree, and relative momentum to characterize the systems. We further explored root computation via iteration with an algorithm extended from Jenkins-Traub method. The solution sets demonstrate a form of σ+ i [-t, t], where σ and t are the real numbers, and the [-t, t] shows various canonical distributions. In this paper, we correlate the convergent sets in the original domain with solution sets, which demonstrating large-deviation distributions in the codomain. We proceed to compare our approach with the formula or principles, such as Donsker-Varadhan and Wentzell-Freidlin theories. The deterministic model based on this construction allows us to explore applications in the areas of finance and statistical mechanics.Keywords: nonlinear Lorentz transformation, Blaschke equation, iteration solutions, root computation, large deviation distribution, deterministic model
Procedia PDF Downloads 393500 Extracting Opinions from Big Data of Indonesian Customer Reviews Using Hadoop MapReduce
Authors: Veronica S. Moertini, Vinsensius Kevin, Gede Karya
Abstract:
Customer reviews have been collected by many kinds of e-commerce websites selling products, services, hotel rooms, tickets and so on. Each website collects its own customer reviews. The reviews can be crawled, collected from those websites and stored as big data. Text analysis techniques can be used to analyze that data to produce summarized information, such as customer opinions. Then, these opinions can be published by independent service provider websites and used to help customers in choosing the most suitable products or services. As the opinions are analyzed from big data of reviews originated from many websites, it is expected that the results are more trusted and accurate. Indonesian customers write reviews in Indonesian language, which comes with its own structures and uniqueness. We found that most of the reviews are expressed with “daily language”, which is informal, do not follow the correct grammar, have many abbreviations and slangs or non-formal words. Hadoop is an emerging platform aimed for storing and analyzing big data in distributed systems. A Hadoop cluster consists of master and slave nodes/computers operated in a network. Hadoop comes with distributed file system (HDFS) and MapReduce framework for supporting parallel computation. However, MapReduce has weakness (i.e. inefficient) for iterative computations, specifically, the cost of reading/writing data (I/O cost) is high. Given this fact, we conclude that MapReduce function is best adapted for “one-pass” computation. In this research, we develop an efficient technique for extracting or mining opinions from big data of Indonesian reviews, which is based on MapReduce with one-pass computation. In designing the algorithm, we avoid iterative computation and instead adopt a “look up table” technique. The stages of the proposed technique are: (1) Crawling the data reviews from websites; (2) cleaning and finding root words from the raw reviews; (3) computing the frequency of the meaningful opinion words; (4) analyzing customers sentiments towards defined objects. The experiments for evaluating the performance of the technique were conducted on a Hadoop cluster with 14 slave nodes. The results show that the proposed technique (stage 2 to 4) discovers useful opinions, is capable of processing big data efficiently and scalable.Keywords: big data analysis, Hadoop MapReduce, analyzing text data, mining Indonesian reviews
Procedia PDF Downloads 201499 Cognitive Science Based Scheduling in Grid Environment
Authors: N. D. Iswarya, M. A. Maluk Mohamed, N. Vijaya
Abstract:
Grid is infrastructure that allows the deployment of distributed data in large size from multiple locations to reach a common goal. Scheduling data intensive applications becomes challenging as the size of data sets are very huge in size. Only two solutions exist in order to tackle this challenging issue. First, computation which requires huge data sets to be processed can be transferred to the data site. Second, the required data sets can be transferred to the computation site. In the former scenario, the computation cannot be transferred since the servers are storage/data servers with little or no computational capability. Hence, the second scenario can be considered for further exploration. During scheduling, transferring huge data sets from one site to another site requires more network bandwidth. In order to mitigate this issue, this work focuses on incorporating cognitive science in scheduling. Cognitive Science is the study of human brain and its related activities. Current researches are mainly focused on to incorporate cognitive science in various computational modeling techniques. In this work, the problem solving approach of human brain is studied and incorporated during the data intensive scheduling in grid environments. Here, a cognitive engine is designed and deployed in various grid sites. The intelligent agents present in CE will help in analyzing the request and creating the knowledge base. Depending upon the link capacity, decision will be taken whether to transfer data sets or to partition the data sets. Prediction of next request is made by the agents to serve the requesting site with data sets in advance. This will reduce the data availability time and data transfer time. Replica catalog and Meta data catalog created by the agents assist in decision making process.Keywords: data grid, grid workflow scheduling, cognitive artificial intelligence
Procedia PDF Downloads 394498 FISCEAPP: FIsh Skin Color Evaluation APPlication
Authors: J. Urban, Á. S. Botella, L. E. Robaina, A. Bárta, P. Souček, P. Císař, Š. Papáček, L. M. Domínguez
Abstract:
Skin coloration in fish is of great physiological, behavioral and ecological importance and can be considered as an index of animal welfare in aquaculture as well as an important quality factor in the retail value. Currently, in order to compare color in animals fed on different diets, biochemical analysis, and colorimetry of fished, mildly anesthetized or dead body, are very accurate and meaningful measurements. The noninvasive method using digital images of the fish body was developed as a standalone application. This application deals with the computation burden and memory consumption of large input files, optimizing piece wise processing and analysis with the memory/computation time ratio. For the comparison of color distributions of various experiments and different color spaces (RGB, CIE L*a*b*) the comparable semi-equidistant binning of multi channels representation is introduced. It is derived from the knowledge of quantization levels and Freedman-Diaconis rule. The color calibrations and camera responsivity function were necessary part of the measurement process.Keywords: color distribution, fish skin color, piecewise transformation, object to background segmentation
Procedia PDF Downloads 262497 Electromagnetic Wave Propagation Equations in 2D by Finite Difference Method
Authors: N. Fusun Oyman Serteller
Abstract:
In this paper, the techniques to solve time dependent electromagnetic wave propagation equations based on the Finite Difference Method (FDM) are proposed by comparing the results with Finite Element Method (FEM) in 2D while discussing some special simulation examples. Here, 2D dynamical wave equations for lossy media, even with a constant source, are discussed for establishing symbolic manipulation of wave propagation problems. The main objective of this contribution is to introduce a comparative study of two suitable numerical methods and to show that both methods can be applied effectively and efficiently to all types of wave propagation problems, both linear and nonlinear cases, by using symbolic computation. However, the results show that the FDM is more appropriate for solving the nonlinear cases in the symbolic solution. Furthermore, some specific complex domain examples of the comparison of electromagnetic waves equations are considered. Calculations are performed through Mathematica software by making some useful contribution to the programme and leveraging symbolic evaluations of FEM and FDM.Keywords: finite difference method, finite element method, linear-nonlinear PDEs, symbolic computation, wave propagation equations
Procedia PDF Downloads 147496 3-D Visualization and Optimization for SISO Linear Systems Using Parametrization of Two-Stage Compensator Design
Authors: Kazuyoshi Mori, Keisuke Hashimoto
Abstract:
In this paper, we consider the two-stage compensator designs of SISO plants. As an investigation of the characteristics of the two-stage compensator designs, which is not well investigated yet, of SISO plants, we implement three dimensional visualization systems of output signals and optimization system for SISO plants by the parametrization of stabilizing controllers based on the two-stage compensator design. The system runs on Mathematica by using “Three Dimensional Surface Plots,” so that the visualization can be interactively manipulated by users. In this paper, we use the discrete-time LTI system model. Even so, our approach is the factorization approach, so that the result can be applied to many linear models.Keywords: linear systems, visualization, optimization, Mathematica
Procedia PDF Downloads 298495 Video Heart Rate Measurement for the Detection of Trauma-Related Stress States
Authors: Jarek Krajewski, David Daxberger, Luzi Beyer
Abstract:
Finding objective and non-intrusive measurements of emotional and psychopathological states (e.g., post-traumatic stress disorder, PTSD) is an important challenge. Thus, the proposed approach here uses Photoplethysmographic imaging (PPGI) applying facial RGB Cam videos to estimate heart rate levels. A pipeline for the signal processing of the raw image has been proposed containing different preprocessing approaches, e.g., Independent Component Analysis, Non-negative Matrix factorization, and various other artefact correction approaches. Under resting and constant light conditions, we reached a sensitivity of 84% for pulse peak detection. The results indicate that PPGI can be a suitable solution for providing heart rate data derived from these indirectly post-traumatic stress states.Keywords: heart rate, PTSD, PPGI, stress, preprocessing
Procedia PDF Downloads 124494 An Efficient Approach for Speed up Non-Negative Matrix Factorization for High Dimensional Data
Authors: Bharat Singh Om Prakash Vyas
Abstract:
Now a day’s applications deal with High Dimensional Data have tremendously used in the popular areas. To tackle with such kind of data various approached has been developed by researchers in the last few decades. To tackle with such kind of data various approached has been developed by researchers in the last few decades. One of the problems with the NMF approaches, its randomized valued could not provide absolute optimization in limited iteration, but having local optimization. Due to this, we have proposed a new approach that considers the initial values of the decomposition to tackle the issues of computationally expensive. We have devised an algorithm for initializing the values of the decomposed matrix based on the PSO (Particle Swarm Optimization). Through the experimental result, we will show the proposed method converse very fast in comparison to other row rank approximation like simple NMF multiplicative, and ACLS techniques.Keywords: ALS, NMF, high dimensional data, RMSE
Procedia PDF Downloads 342493 Artificial Reproduction System and Imbalanced Dataset: A Mendelian Classification
Authors: Anita Kushwaha
Abstract:
We propose a new evolutionary computational model called Artificial Reproduction System which is based on the complex process of meiotic reproduction occurring between male and female cells of the living organisms. Artificial Reproduction System is an attempt towards a new computational intelligence approach inspired by the theoretical reproduction mechanism, observed reproduction functions, principles and mechanisms. A reproductive organism is programmed by genes and can be viewed as an automaton, mapping and reducing so as to create copies of those genes in its off springs. In Artificial Reproduction System, the binding mechanism between male and female cells is studied, parameters are chosen and a network is constructed also a feedback system for self regularization is established. The model then applies Mendel’s law of inheritance, allele-allele associations and can be used to perform data analysis of imbalanced data, multivariate, multiclass and big data. In the experimental study Artificial Reproduction System is compared with other state of the art classifiers like SVM, Radial Basis Function, neural networks, K-Nearest Neighbor for some benchmark datasets and comparison results indicates a good performance.Keywords: bio-inspired computation, nature- inspired computation, natural computing, data mining
Procedia PDF Downloads 272492 Towards a Distributed Computation Platform Tailored for Educational Process Discovery and Analysis
Authors: Awatef Hicheur Cairns, Billel Gueni, Hind Hafdi, Christian Joubert, Nasser Khelifa
Abstract:
Given the ever changing needs of the job markets, education and training centers are increasingly held accountable for student success. Therefore, education and training centers have to focus on ways to streamline their offers and educational processes in order to achieve the highest level of quality in curriculum contents and managerial decisions. Educational process mining is an emerging field in the educational data mining (EDM) discipline, concerned with developing methods to discover, analyze and provide a visual representation of complete educational processes. In this paper, we present our distributed computation platform which allows different education centers and institutions to load their data and access to advanced data mining and process mining services. To achieve this, we present also a comparative study of the different clustering techniques developed in the context of process mining to partition efficiently educational traces. Our goal is to find the best strategy for distributing heavy analysis computations on many processing nodes of our platform.Keywords: educational process mining, distributed process mining, clustering, distributed platform, educational data mining, ProM
Procedia PDF Downloads 454491 An Application of Sinc Function to Approximate Quadrature Integrals in Generalized Linear Mixed Models
Authors: Altaf H. Khan, Frank Stenger, Mohammed A. Hussein, Reaz A. Chaudhuri, Sameera Asif
Abstract:
This paper discusses a novel approach to approximate quadrature integrals that arise in the estimation of likelihood parameters for the generalized linear mixed models (GLMM) as well as Bayesian methodology also requires computation of multidimensional integrals with respect to the posterior distributions in which computation are not only tedious and cumbersome rather in some situations impossible to find solutions because of singularities, irregular domains, etc. An attempt has been made in this work to apply Sinc function based quadrature rules to approximate intractable integrals, as there are several advantages of using Sinc based methods, for example: order of convergence is exponential, works very well in the neighborhood of singularities, in general quite stable and provide high accurate and double precisions estimates. The Sinc function based approach seems to be utilized first time in statistical domain to our knowledge, and it's viability and future scopes have been discussed to apply in the estimation of parameters for GLMM models as well as some other statistical areas.Keywords: generalized linear mixed model, likelihood parameters, qudarature, Sinc function
Procedia PDF Downloads 395490 Private Coded Computation of Matrix Multiplication
Authors: Malihe Aliasgari, Yousef Nejatbakhsh
Abstract:
The era of Big Data and the immensity of real-life datasets compels computation tasks to be performed in a distributed fashion, where the data is dispersed among many servers that operate in parallel. However, massive parallelization leads to computational bottlenecks due to faulty servers and stragglers. Stragglers refer to a few slow or delay-prone processors that can bottleneck the entire computation because one has to wait for all the parallel nodes to finish. The problem of straggling processors, has been well studied in the context of distributed computing. Recently, it has been pointed out that, for the important case of linear functions, it is possible to improve over repetition strategies in terms of the tradeoff between performance and latency by carrying out linear precoding of the data prior to processing. The key idea is that, by employing suitable linear codes operating over fractions of the original data, a function may be completed as soon as enough number of processors, depending on the minimum distance of the code, have completed their operations. The problem of matrix-matrix multiplication in the presence of practically big sized of data sets faced with computational and memory related difficulties, which makes such operations are carried out using distributed computing platforms. In this work, we study the problem of distributed matrix-matrix multiplication W = XY under storage constraints, i.e., when each server is allowed to store a fixed fraction of each of the matrices X and Y, which is a fundamental building of many science and engineering fields such as machine learning, image and signal processing, wireless communication, optimization. Non-secure and secure matrix multiplication are studied. We want to study the setup, in which the identity of the matrix of interest should be kept private from the workers and then obtain the recovery threshold of the colluding model, that is, the number of workers that need to complete their task before the master server can recover the product W. The problem of secure and private distributed matrix multiplication W = XY which the matrix X is confidential, while matrix Y is selected in a private manner from a library of public matrices. We present the best currently known trade-off between communication load and recovery threshold. On the other words, we design an achievable PSGPD scheme for any arbitrary privacy level by trivially concatenating a robust PIR scheme for arbitrary colluding workers and private databases and the proposed SGPD code that provides a smaller computational complexity at the workers.Keywords: coded distributed computation, private information retrieval, secret sharing, stragglers
Procedia PDF Downloads 122489 SIF Computation of Cracked Plate by FEM
Authors: Sari Elkahina, Zergoug Mourad, Benachenhou Kamel
Abstract:
The main purpose of this paper is to perform a computations comparison of stress intensity factor 'SIF' evaluation in case of cracked thin plate with Aluminum alloy 7075-T6 and 2024-T3 used in aeronautics structure under uniaxial loading. This evaluation is based on finite element method with a virtual power principle through two techniques: the extrapolation and G−θ. The first one consists to extrapolate the nodal displacements near the cracked tip using a refined triangular mesh with T3 and T6 special elements, while the second, consists of determining the energy release rate G through G−θ method by potential energy derivation which corresponds numerically to the elastic solution post-processing of a cracked solid by a contour integration computation via Gauss points. The SIF obtained results from extrapolation and G−θ methods will be compared to an analytical solution in a particular case. To illustrate the influence of the meshing kind and the size of integration contour position simulations are presented and analyzed.Keywords: crack tip, SIF, finite element method, concentration technique, displacement extrapolation, aluminum alloy 7075-T6 and 2024-T3, energy release rate G, G-θ method, Gauss point numerical integration
Procedia PDF Downloads 337488 On Direct Matrix Factored Inversion via Broyden's Updates
Authors: Adel Mohsen
Abstract:
A direct method based on the good Broyden's updates for evaluating the inverse of a nonsingular square matrix of full rank and solving related system of linear algebraic equations is studied. For a matrix A of order n whose LU-decomposition is A = LU, the multiplication count is O (n3). This includes the evaluation of the LU-decompositions of the inverse, the lower triangular decomposition of A as well as a “reduced matrix inverse”. If an explicit value of the inverse is not needed the order reduces to O (n3/2) to compute to compute inv(U) and the reduced inverse. For a symmetric matrix only O (n3/3) operations are required to compute inv(L) and the reduced inverse. An example is presented to demonstrate the capability of using the reduced matrix inverse in treating ill-conditioned systems. Besides the simplicity of Broyden's update, the method provides a mean to exploit the possible sparsity in the matrix and to derive a suitable preconditioner.Keywords: Broyden's updates, matrix inverse, inverse factorization, solution of linear algebraic equations, ill-conditioned matrices, preconditioning
Procedia PDF Downloads 479487 Lecture Video Indexing and Retrieval Using Topic Keywords
Authors: B. J. Sandesh, Saurabha Jirgi, S. Vidya, Prakash Eljer, Gowri Srinivasa
Abstract:
In this paper, we propose a framework to help users to search and retrieve the portions in the lecture video of their interest. This is achieved by temporally segmenting and indexing the lecture video using the topic keywords. We use transcribed text from the video and documents relevant to the video topic extracted from the web for this purpose. The keywords for indexing are found by applying the non-negative matrix factorization (NMF) topic modeling techniques on the web documents. Our proposed technique first creates indices on the transcribed documents using the topic keywords, and these are mapped to the video to find the start and end time of the portions of the video for a particular topic. This time information is stored in the index table along with the topic keyword which is used to retrieve the specific portions of the video for the query provided by the users.Keywords: video indexing and retrieval, lecture videos, content based video search, multimodal indexing
Procedia PDF Downloads 250486 Presenting the Mathematical Model to Determine Retention in the Watersheds
Authors: S. Shamohammadi, L. Razavi
Abstract:
This paper based on the principle concepts of SCS-CN model, a new mathematical model for computation of retention potential (S) presented. In the mathematical model, not only precipitation-runoff concepts in SCS-CN model are precisely represented in a mathematical form, but also new concepts, called “maximum retention” and “total retention” is introduced, and concepts of potential retention capacity, maximum retention, and total retention have been separated from each other. In the proposed model, actual retention (F), maximum actual retention (Fmax), total retention (S), maximum retention (Smax), and potential retention (Sp), for the first time clearly defined, so that Sp is not variable, but a function of morphological characteristics of the watershed. Indeed, based on the mathematical relation of the conceptual curve of SCS-CN model, the proposed model provides a new method for the computation of actual retention in watershed and it simply determined runoff based on. In the corresponding relations, in addition to Precipitation (P), Initial retention (Ia), cumulative values of actual retention capacity (F), total retention (S), runoff (Q), antecedent moisture (M), potential retention (Sp), total retention (S), we introduced Fmax and Fmin referring to maximum and minimum actual retention, respectively. As well as, ksh is a coefficient which depends on morphological characteristics of the watershed. Advantages of the modified version versus the original model include a better precision, higher performance, easier calibration and speed computing.Keywords: model, mathematical, retention, watershed, SCS
Procedia PDF Downloads 457485 Operator Optimization Based on Hardware Architecture Alignment Requirements
Authors: Qingqing Gai, Junxing Shen, Yu Luo
Abstract:
Due to the hardware architecture characteristics, some operators tend to acquire better performance if the input/output tensor dimensions are aligned to a certain minimum granularity, such as convolution and deconvolution commonly used in deep learning. Furthermore, if the requirements are not met, the general strategy is to pad with 0 to satisfy the requirements, potentially leading to the under-utilization of the hardware resources. Therefore, for the convolution and deconvolution whose input and output channels do not meet the minimum granularity alignment, we propose to transfer the W-dimensional data to the C-dimension for computation (W2C) to enable the C-dimension to meet the hardware requirements. This scheme also reduces the number of computations in the W-dimension. Although this scheme substantially increases computation, the operator’s speed can improve significantly. It achieves remarkable speedups on multiple hardware accelerators, including Nvidia Tensor cores, Qualcomm digital signal processors (DSPs), and Huawei neural processing units (NPUs). All you need to do is modify the network structure and rearrange the operator weights offline without retraining. At the same time, for some operators, such as the Reducemax, we observe that transferring the Cdimensional data to the W-dimension(C2W) and replacing the Reducemax with the Maxpool can accomplish acceleration under certain circumstances.Keywords: convolution, deconvolution, W2C, C2W, alignment, hardware accelerator
Procedia PDF Downloads 104484 Safety Approach Highway Alignment Optimization
Authors: Seyed Abbas Tabatabaei, Marjan Naderan Tahan, Arman Kadkhodai
Abstract:
An efficient optimization approach, called feasible gate (FG), is developed to enhance the computation efficiency and solution quality of the previously developed highway alignment optimization (HAO) model. This approach seeks to realistically represent various user preferences and environmentally sensitive areas and consider them along with geometric design constraints in the optimization process. This is done by avoiding the generation of infeasible solutions that violate various constraints and thus focusing the search on the feasible solutions. The proposed method is simple, but improves significantly the model’s computation time and solution quality. On the other, highway alignment optimization through Feasible Gates, eventuates only economic model by considering minimum design constrains includes minimum reduce of circular curves, minimum length of vertical curves and road maximum gradient. This modelling can reduce passenger comfort and road safety. In most of highway optimization models, by adding penalty function for each constraint, final result handles to satisfy minimum constraint. In this paper, we want to propose a safety-function solution by introducing gift function.Keywords: safety, highway geometry, optimization, alignment
Procedia PDF Downloads 410483 Analyzing the Factors that Cause Parallel Performance Degradation in Parallel Graph-Based Computations Using Graph500
Authors: Mustafa Elfituri, Jonathan Cook
Abstract:
Recently, graph-based computations have become more important in large-scale scientific computing as they can provide a methodology to model many types of relations between independent objects. They are being actively used in fields as varied as biology, social networks, cybersecurity, and computer networks. At the same time, graph problems have some properties such as irregularity and poor locality that make their performance different than regular applications performance. Therefore, parallelizing graph algorithms is a hard and challenging task. Initial evidence is that standard computer architectures do not perform very well on graph algorithms. Little is known exactly what causes this. The Graph500 benchmark is a representative application for parallel graph-based computations, which have highly irregular data access and are driven more by traversing connected data than by computation. In this paper, we present results from analyzing the performance of various example implementations of Graph500, including a shared memory (OpenMP) version, a distributed (MPI) version, and a hybrid version. We measured and analyzed all the factors that affect its performance in order to identify possible changes that would improve its performance. Results are discussed in relation to what factors contribute to performance degradation.Keywords: graph computation, graph500 benchmark, parallel architectures, parallel programming, workload characterization.
Procedia PDF Downloads 147482 Scalable Systolic Multiplier over Binary Extension Fields Based on Two-Level Karatsuba Decomposition
Authors: Chiou-Yng Lee, Wen-Yo Lee, Chieh-Tsai Wu, Cheng-Chen Yang
Abstract:
Shifted polynomial basis (SPB) is a variation of polynomial basis representation. SPB has potential for efficient bit-level and digit-level implementations of multiplication over binary extension fields with subquadratic space complexity. For efficient implementation of pairing computation with large finite fields, this paper presents a new SPB multiplication algorithm based on Karatsuba schemes, and used that to derive a novel scalable multiplier architecture. Analytical results show that the proposed multiplier provides a trade-off between space and time complexities. Our proposed multiplier is modular, regular, and suitable for very-large-scale integration (VLSI) implementations. It involves less area complexity compared to the multipliers based on traditional decomposition methods. It is therefore, more suitable for efficient hardware implementation of pairing based cryptography and elliptic curve cryptography (ECC) in constraint driven applications.Keywords: digit-serial systolic multiplier, elliptic curve cryptography (ECC), Karatsuba algorithm (KA), shifted polynomial basis (SPB), pairing computation
Procedia PDF Downloads 362481 Computation of Radiotherapy Treatment Plans Based on CT to ED Conversion Curves
Authors: B. Petrović, L. Rutonjski, M. Baucal, M. Teodorović, O. Čudić, B. Basarić
Abstract:
Radiotherapy treatment planning computers use CT data of the patient. For the computation of a treatment plan, treatment planning system must have an information on electron densities of tissues scanned by CT. This information is given by the conversion curve CT (CT number) to ED (electron density), or simply calibration curve. Every treatment planning system (TPS) has built in default CT to ED conversion curves, for the CTs of different manufacturers. However, it is always recommended to verify the CT to ED conversion curve before actual clinical use. Objective of this study was to check how the default curve already provided matches the curve actually measured on a specific CT, and how much it influences the calculation of a treatment planning computer. The examined CT scanners were from the same manufacturer, but four different scanners from three generations. The measurements of all calibration curves were done with the dedicated phantom CIRS 062M Electron Density Phantom. The phantom was scanned, and according to real HU values read at the CT console computer, CT to ED conversion curves were generated for different materials, for same tube voltage 140 kV. Another phantom, CIRS Thorax 002 LFC which represents an average human torso in proportion, density and two-dimensional structure, was used for verification. The treatment planning was done on CT slices of scanned CIRS LFC 002 phantom, for selected cases. Interest points were set in the lungs, and in the spinal cord, and doses recorded in TPS. The overall calculated treatment times for four scanners and default scanner did not differ more than 0.8%. Overall interest point dose in bone differed max 0.6% while for single fields was maximum 2.7% (lateral field). Overall interest point dose in lungs differed max 1.1% while for single fields was maximum 2.6% (lateral field). It is known that user should verify the CT to ED conversion curve, but often, developing countries are facing lack of QA equipment, and often use default data provided. We have concluded that the CT to ED curves obtained differ in certain points of a curve, generally in the region of higher densities. This influences the treatment planning result which is not significant, but definitely does make difference in the calculated dose.Keywords: Computation of treatment plan, conversion curve, radiotherapy, electron density
Procedia PDF Downloads 486