Search results for: Parallel platform.
1086 Parallel Vector Processing Using Multi Level Orbital DATA
Authors: Nagi Mekhiel
Abstract:
Many applications use vector operations, applying a single instruction to multiple data that map to different locations in conventional memory. Transferring data from memory is limited by access latency and bandwidth, which limits the performance gain of vector processing. We present a memory system that makes all of its content available to processors in time, so that processors need not access the memory; each location is forced to be available to all processors at a specific time. The data move in different orbits to become available to other processors in higher orbits at different times. We use this memory to apply parallel vector operations to data streams at the first orbit level. Data processed in the first level move to an upper orbit one data element at a time, allowing a processor in that orbit to apply another vector operation to deal with the serial code limitations inherent in all parallel applications and to interleave it with lower-level vector operations.
Keywords: Memory organization, parallel processors, serial code, vector processing.
Downloads: 1063
1085 Analyzing the Factors that Cause Parallel Performance Degradation in Parallel Graph-Based Computations Using Graph500
Authors: Mustafa Elfituri, Jonathan Cook
Abstract:
Recently, graph-based computations have become more important in large-scale scientific computing as they can provide a methodology to model many types of relations between independent objects. They are being actively used in fields as varied as biology, social networks, cybersecurity, and computer networks. At the same time, graph problems have properties such as irregularity and poor locality that make their performance different from that of regular applications. Therefore, parallelizing graph algorithms is a hard and challenging task. Initial evidence is that standard computer architectures do not perform very well on graph algorithms. Little is known about exactly what causes this. The Graph500 benchmark is a representative application for parallel graph-based computations, which have highly irregular data access and are driven more by traversing connected data than by computation. In this paper, we present results from analyzing the performance of various example implementations of Graph500, including a shared memory (OpenMP) version, a distributed (MPI) version, and a hybrid version. We measured and analyzed all the factors that affect its performance in order to identify possible changes that would improve its performance. Results are discussed in relation to what factors contribute to performance degradation.
Keywords: Graph computation, Graph500 benchmark, parallel architectures, parallel programming, workload characterization.
Downloads: 548
1084 A Survey on Performance Tools for OpenMP
Authors: Mubarak S. Mohsen, Rosni Abdullah, Yong M. Teo
Abstract:
Advances in processor architecture, such as multicore, increase the size and complexity of parallel computer systems. With multi-core architectures there are different parallel languages that can be used to run parallel programs. One of these is OpenMP, which is embedded in C/C++ or FORTRAN. Because of this new architecture and its complexity, it is very important to evaluate the performance of OpenMP constructs, kernels, and application programs on multi-core systems. Performance measurement is the activity of collecting information about the execution characteristics of a program. Performance tools consist of at least three interfacing software layers: instrumentation, measurement, and analysis. The instrumentation layer defines the measured performance events. The measurement layer determines what performance event is actually captured and how it is measured by the tool. The analysis layer processes the performance data and summarizes it into a form that can be displayed in performance tools. In this paper, a number of OpenMP performance tools are surveyed, explaining how each is used to collect, analyse, and display data.
Keywords: Parallel performance tools, OpenMP, multi-core.
Downloads: 1923
1083 The Effectiveness of University’s Strategic Plan for Sustainability through Collaborative Platform’s Deliberation Matrix
Authors: Ashiquer Rahman
Abstract:
The paper focuses on the significance of the university's sustainability strategic plan and emphasizes the usefulness of the collaborative platform-based deliberation matrix. It will equip the university's leadership to handle impending tactics and challenges with the sustainability of the university’s strategic plan. The study addresses the significance of a set of reference points that will precede operational activities for multi-stakeholder multi-criteria evaluation on the optimal standards of Sustainable University, as well as potential action for the strategic blueprint of Sustainable University. It makes reference to the university’s sustainability strategy plan’s effectiveness through a collaborative platform and deliberation matrix. The paper outlines the conceptual framing of a sustainable university by implementing a strategic plan over the collaborative platform and deliberation matrix. Optimistically, these will be a milestone in higher education; a pathway to prepare for the University’s upcoming implementation of its sustainability strategy. In fact, the collaborative platform and deliberation matrix both are enhancement needles for institutional cooperation to the completive world.
Keywords: Sustainable strategies, institutional cooperation, multi-stakeholder multi-criteria assessment, collaborative platform, innovative method and tools.
Downloads: 84
1082 Isotropic Stress Distribution in Cu/(001) Fe Two Sheets
Authors: A. Derardja, L. Baroura, M. Brioua
Abstract:
The nanotechnology based on epitaxial systems includes single or arranged misfit dislocations. In general, whatever the type of dislocation or the geometry of the array formed by the dislocations, it is important for experimental studies to know exactly the stress distribution, for which there is no analytical expression [1, 2]. This work, using numerical analysis, deals with the relaxation of epitaxial layers having at their interface a periodic network of edge misfit dislocations. The stress distribution is estimated by using isotropic elasticity. The results show that the thickness of the two sheets is a crucial parameter in the stress distributions and therefore in the profile of the two sheets. A comparative study between the case of a single dislocation and the case of a parallel network shows that the layers relax better when the interface is covered by a parallel arrangement of misfit. Consequently, a single dislocation at the interface produces an important stress field which can be reduced by inserting a parallel network of dislocations with suitable periodicity.
Keywords: Parallel array of misfit, interface, isotropic elasticity, single crystalline substrates, coherent interface
Downloads: 1572
1081 On Fault Diagnosis of Asynchronous Sequential Machines with Parallel Composition
Authors: Jung-Min Yang
Abstract:
Fault diagnosis of composite asynchronous sequential machines with parallel composition is addressed in this paper. An adversarial input can infiltrate one of two submachines comprising the composite asynchronous machine, causing an unauthorized state transition. The objective is to characterize the condition under which the controller can diagnose any fault occurrence. Two control configurations, state feedback and output feedback, are considered in this paper. In the case of output feedback, the exact estimation of the state is impossible since the current state is inaccessible and the output feedback is given as the form of burst. A simple example is provided to demonstrate the proposed methodology.
Keywords: Asynchronous sequential machines, parallel composition, fault diagnosis.
Downloads: 973
1080 Conditions for Fault Recovery of Interconnected Asynchronous Sequential Machines with State Feedback
Authors: Jung–Min Yang
Abstract:
In this paper, fault recovery for parallel interconnected asynchronous sequential machines is studied. An adversarial input can infiltrate into one of two submachines comprising parallel composition of the considered asynchronous sequential machine, causing an unauthorized state transition. The control objective is to elucidate the condition for the existence of a corrective controller that makes the closed-loop system immune against any occurrence of adversarial inputs. In particular, an efficient existence condition is presented that does not need the complete modeling of the interconnected asynchronous sequential machine.
Keywords: Asynchronous sequential machines, parallel composition, corrective control, fault tolerance.
Downloads: 839
1079 Hybrid Prefix Adder Architecture for Minimizing the Power Delay Product
Authors: P.Ramanathan, P.T.Vanathi
Abstract:
Parallel Prefix addition is a technique for improving the speed of binary addition. Due to continuing integrating intensity and the growing needs of portable devices, low-power and high-performance designs are of prime importance. The classical parallel prefix adder structures presented in the literature over the years optimize for logic depth, area, fan-out and interconnect count of logic circuits. In this paper, a new architecture for performing 8-bit, 16-bit and 32-bit Parallel Prefix addition is proposed. The proposed prefix adder structures are compared with several classical adders of the same bit width in terms of power, delay and number of computational nodes. The results reveal that the proposed structures have the least power delay product when compared with their peer existing Prefix adder structures. The Tanner EDA tool was used for simulating the adder designs in the TSMC 180 nm and TSMC 130 nm technologies.
Keywords: Parallel Prefix Adder (PPA), Dot operator, Semi-Dot operator, Complementary Metal Oxide Semiconductor (CMOS), Odd-dot operator, Even-dot operator, Odd-semi-dot operator and Even-semi-dot operator.
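For readers unfamiliar with parallel prefix addition, the following is a minimal sketch of one classical structure, a Kogge-Stone style adder, written with bitwise operations. It is not the hybrid architecture proposed in the paper; the function name `kogge_stone_add` and the default 8-bit width are illustrative assumptions.

```python
def kogge_stone_add(a: int, b: int, width: int = 8) -> int:
    """Add two unsigned integers modulo 2**width with a Kogge-Stone prefix network.

    Illustrative only: this models a classical prefix structure, not the
    hybrid adder described in the abstract.
    """
    mask = (1 << width) - 1
    a &= mask
    b &= mask

    generate = a & b          # bit i produces a carry on its own
    propagate = a ^ b         # bit i passes an incoming carry along
    half_sum = propagate      # kept for the final sum stage

    # log2(width) prefix stages; each stage doubles the span of bits
    # that every (generate, propagate) pair summarizes.
    shift = 1
    while shift < width:
        generate = (generate | (propagate & (generate << shift))) & mask
        propagate = (propagate & (propagate << shift)) & mask
        shift <<= 1

    carries = (generate << 1) & mask   # carry into bit i comes from bits below i
    return (half_sum ^ carries) & mask


if __name__ == "__main__":
    print(kogge_stone_add(200, 100))   # 8-bit addition wraps modulo 256
```

All carries are resolved in a number of stages that grows with the logarithm of the bit width, which is the property that makes prefix adders faster than ripple-carry addition.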
Downloads: 1726
1078 Modified Scaling-Free CORDIC Based Pipelined Parallel MDC FFT and IFFT Architecture for Radix 2^2 Algorithm
Authors: C. Paramasivam, K. B. Jayanthi
Abstract:
An innovative approach to develop modified scaling-free CORDIC based two parallel pipelined Multipath Delay Commutator (MDC) FFT and IFFT architectures for the radix-2^2 FFT algorithm is presented. Multipliers and adders are the most important data paths in FFT and IFFT architectures. Multipliers occupy high area and consume more power. In order to optimize the area and power overhead, a modified scaling-free CORDIC based complex multiplier is utilized in the proposed design. In general, twiddle factor values are stored in a RAM block. In the proposed work, a modified scaling-free CORDIC based twiddle factor generator unit is used to generate the twiddle factor and efficient switching units are used. In addition to this, four-point FFT operations are performed without complex multiplication, which helps to reduce area and power in the last two stages of the pipelined architectures. The design proposed in this paper is based on the multipath delay commutator method. The proposed design can be extended to any radix-2^n based FFT/IFFT algorithm to improve the throughput. The work is synthesized using Synopsys Design Compiler with the TSMC 90-nm library. The proposed method proves to be better compared to the reference design in terms of area, throughput and power consumption. The comparative analysis of the proposed design with the Xilinx FPGA platform is also discussed in the paper.
Keywords: Coordinate Rotational Digital Computer (CORDIC), Complex multiplier, Fast Fourier transform (FFT), Inverse fast Fourier transform (IFFT), Multipath delay Commutator (MDC), modified scaling free CORDIC, complex multiplier, pipelining, parallel processing, radix-2^2.
Downloads: 1821
1077 Some Computational Results on MPI Parallel Implementation of Dense Simplex Method
Authors: El-Said Badr, Mahmoud Moussa, Konstantinos Paparrizos, Nikolaos Samaras, Angelo Sifaleras
Abstract:
There are two major variants of the Simplex Algorithm: the revised method and the standard, or tableau, method. Today, all serious implementations are based on the revised method because it is more efficient for sparse linear programming problems. However, there are a number of applications that lead to dense linear problems, so our aim in this paper is to present some computational results on a parallel implementation of the dense Simplex Method. Our implementation runs on an SMP cluster using the C programming language and the Message Passing Interface (MPI). Preliminary computational results on randomly generated dense linear programs support our results.
Keywords: Linear Programming, MPI, Parallel Implementation, Simplex Algorithm.
Downloads: 2050
1076 Fuzzy Logic Based Active Vibration Control of Piezoelectric Stewart Platform
Authors: Arian Bahrami, Mojtaba Tafaoli-Masoule, Mansour Nikkhah Bahrami
Abstract:
This paper demonstrates the potential of applying PD-like fuzzy logic controller for active vibration control of piezoelectric Stewart platforms. Through simulation, the control authority of the piezo stack actuators for effectively damping the Stewart platform vibration can be evaluated for further implementation of the system. Each leg of the piezoelectric Stewart platform consists of a linear piezo stack actuator, a collocated velocity sensor, a collocated displacement sensor and flexible tips for the connections with the two end plates. The piezoelectric stack is modeled as a bar element and the electro-mechanical coupling property is simulated using Matlab/Simulink software. Then, the open loop and closed loop dynamic responses are performed for the system to characterize the effect of the control on the vibration of the piezoelectric Stewart platform. A significant improvement in the damping of the structure can be observed by using the PD-like fuzzy controller.
Keywords: Active vibration control, Fuzzy controller, Piezoelectric stewart platform.
Downloads: 2898
1075 Using the PGAS Programming Paradigm for Biological Sequence Alignment on a Chip Multi-Threading Architecture
Authors: M. Bakhouya, S. A. Bahra, T. El-Ghazawi
Abstract:
The Partitioned Global Address Space (PGAS) programming paradigm offers ease-of-use in expressing parallelism through a global shared address space while emphasizing performance by providing locality awareness through the partitioning of this address space. Therefore, the interest in PGAS programming languages is growing and many new languages have emerged and are becoming ubiquitously available on nearly all modern parallel architectures. Recently, new parallel machines with multiple cores are designed for targeting high performance applications. Most of the efforts have gone into benchmarking but there are a few examples of real high performance applications running on multicore machines. In this paper, we present and evaluate a parallelization technique for implementing a local DNA sequence alignment algorithm using a PGAS based language, UPC (Unified Parallel C) on a chip multithreading architecture, the UltraSPARC T1.
Keywords: Partitioned Global Address Space, Unified Parallel C, Multicore machines, Multi-threading Architecture, Sequence alignment.
Downloads: 1391
1074 Simulation with Uncertainties of Active Controlled Vibration Isolation System for Astronaut’s Exercise Platform
Authors: Shield B. Lin, Ziraguen O. Williams
Abstract:
In a task to assist NASA in analyzing the dynamic forces caused by operational countermeasures of an astronaut’s exercise platform impacting the spacecraft, an active proportional-integral-derivative controller commanding a linear actuator is proposed in a vibration isolation system to regulate the movement of the exercise platform. Computer simulation shows promising results that most exciter forces can be reduced or even eliminated. This paper emphasizes on parameter uncertainties, variations and exciter force variations. Drift and variations of system parameters in the vibration isolation system for astronaut’s exercise platform are analyzed. An active controlled scheme is applied with the goals to reduce the platform displacement and to minimize the force being transmitted to the spacecraft structure. The controller must be robust enough to accommodate the wide variations of system parameters and exciter forces. Computer simulation for the vibration isolation system was performed via MATLAB/Simulink and Trick. The simulation results demonstrate the achievement of force reduction with small platform displacement under wide ranges of variations in system parameters.
Keywords: control, counterweight, isolation, vibration
Downloads: 441
1073 Fault Diagnosis of Nonlinear Systems Using Dynamic Neural Networks
Authors: E. Sobhani-Tehrani, K. Khorasani, N. Meskin
Abstract:
This paper presents a novel integrated hybrid approach for fault diagnosis (FD) of nonlinear systems. Unlike most FD techniques, the proposed solution simultaneously accomplishes fault detection, isolation, and identification (FDII) within a unified diagnostic module. At the core of this solution is a bank of adaptive neural parameter estimators (NPE) associated with a set of single-parameter fault models. The NPEs continuously estimate unknown fault parameters (FP) that are indicators of faults in the system. Two NPE structures including series-parallel and parallel are developed with their exclusive set of desirable attributes. The parallel scheme is extremely robust to measurement noise and possesses a simpler, yet more solid, fault isolation logic. On the contrary, the series-parallel scheme displays short FD delays and is robust to closed-loop system transients due to changes in control commands. Finally, a fault tolerant observer (FTO) is designed to extend the capability of the NPEs to systems with partial-state measurement.
Keywords: Hybrid fault diagnosis, Dynamic neural networks, Nonlinear systems.
Downloads: 2223
1072 The Strategy of Creating a Virtual Interactive Platform for the Low-Carbon Open Innovations Relay
Authors: Mykola S. Shestavin
Abstract:
This paper presents a strategy for the creation of a Virtual Interactive Platform (or Networking Platform) that combines four web-based expert systems on the transfer and diffusion of low-carbon technologies. It uses the concepts of “Open Innovation” and “Triple Helix” with regard to theories of “Green Growth” and “Carbon Footprint”. The interpreters of the expert systems operate on the basis of “Predator-Prey” models of the process of transfer and diffusion of technologies, taking into account the features caused by the need to mitigate the effects of climate change.
Keywords: Climate Change, Expert Systems, Low-Carbon Technology, Open Innovation, Virtual Interactive Platform.
Downloads: 1893
1071 Parallel Particle Swarm Optimization Optimized LDI Controller with Lyapunov Stability Criterion for Nonlinear Structural Systems
Authors: P.-W. Tsai, W.-L. Hong, C.-W. Chen, C.-Y. Chen
Abstract:
In this paper, we present a neural-network (NN) based approach to represent a nonlinear Takagi-Sugeno (T-S) system. A linear differential inclusion (LDI) state-space representation is utilized to deal with the NN models. Taking advantage of the LDI representation, the stability conditions and controller design are derived for a class of nonlinear structural systems. Moreover, the concept of utilizing the Parallel Particle Swarm Optimization (PPSO) algorithm to solve the common P matrix under the stability criteria is given in this paper.
Keywords: Lyapunov Stability, Parallel Particle Swarm Optimization, Linear Differential Inclusion, Artificial Intelligence.
Downloads: 1865
1070 Simulation Modeling of Manufacturing Systems for the Serial Route and the Parallel One
Authors: Tadeusz Witkowski, Paweł Antczak, Arkadiusz Antczak
Abstract:
In the paper we discuss the influence of the route flexibility degree, the open rate of operations and the production type coefficient on makespan. The flexible job-open shop scheduling problem FJOSP (an extension of the classical job shop scheduling) is analyzed. For the analysis of the production process we used a hybrid heuristic of the GRASP (greedy randomized adaptive search procedure) with simulated annealing algorithm. Experiments with different levels of factors have been considered and compared. The GRASP+SA algorithm has been tested and illustrated with results for the serial route and the parallel one.
Keywords: Makespan, open shop, route flexibility, serial and parallel route, simulation modeling, type of production.
Downloads: 1815
1069 Performance Evaluation of Parallel Surface Modeling and Generation on Actual and Virtual Multicore Systems
Authors: Nyeng P. Gyang
Abstract:
Even though past, current and future trends suggest that multicore and cloud computing systems are increasingly prevalent/ubiquitous, this class of parallel systems is nonetheless underutilized, in general, and barely used for research on employing parallel Delaunay triangulation for parallel surface modeling and generation, in particular. The performances of actual/physical and virtual/cloud multicore systems/machines at executing various algorithms, which implement various parallelization strategies of the incremental insertion technique of the Delaunay triangulation algorithm, were evaluated. T-tests were run on the data collected in order to determine whether differences in various performance metrics (including execution time, speedup and efficiency) were statistically significant. Results show that the actual machine is approximately twice as fast as the virtual machine at executing the same programs for the various parallelization strategies. Results, which furnish the scalability behaviors of the various parallelization strategies, also show that some of the differences between the performances of these systems, during different runs of the algorithms on the systems, were statistically significant. A few pseudo superlinear speedup results, which were computed from the raw data collected, are not true superlinear speedup values. These pseudo superlinear speedup values, which arise as a result of one way of computing speedups, disappear and give way to asymmetric speedups, which are the accurate kind of speedups that occur in the experiments performed.
Keywords: Cloud computing systems, multicore systems, parallel Delaunay triangulation, parallel surface modeling and generation.
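The speedup and efficiency metrics named in this abstract are commonly defined as in the short sketch below. The abstract does not state which formulas the authors used, so the textbook definitions, the function names and the timings here are all assumptions chosen for illustration.

```python
def speedup(t_serial: float, t_parallel: float) -> float:
    """Textbook speedup: serial run time divided by parallel run time."""
    return t_serial / t_parallel


def efficiency(t_serial: float, t_parallel: float, workers: int) -> float:
    """Textbook efficiency: speedup divided by the number of workers."""
    return speedup(t_serial, t_parallel) / workers


if __name__ == "__main__":
    # Hypothetical timings in seconds, not data from the paper.
    t1, t4 = 12.0, 4.0
    print(speedup(t1, t4))         # 3.0
    print(efficiency(t1, t4, 4))   # 0.75
```

Under these definitions a speedup greater than the number of workers (efficiency above 1) would be called superlinear.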
Downloads: 880
1068 Series-Parallel Systems Reliability Optimization Using Genetic Algorithm and Statistical Analysis
Authors: Essa Abrahim Abdulgader Saleem, Thien-My Dao
Abstract:
The main objective of this paper is to optimize series-parallel system reliability using a Genetic Algorithm (GA) and statistical analysis, considering system reliability constraints which involve the redundant numbers of selected components, total cost, and total weight. To perform this work, firstly the mathematical model which maximizes system reliability subject to maximum system cost and maximum system weight constraints is presented; secondly, a statistical analysis is used to optimize GA parameters; and thirdly, GA is used to optimize series-parallel systems reliability. The objective is to determine the strategy for choosing the redundancy level for each subsystem to maximize the overall system reliability subject to total cost and total weight constraints. Finally, the series-parallel system case study reliability optimization results are shown, and comparisons with previous results are presented to demonstrate the performance of our GA.
Keywords: Genetic algorithm, optimization, reliability, statistical analysis.
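As a rough illustration of the objective such an optimizer evaluates, the sketch below computes the reliability of a series-parallel system under common textbook assumptions: independent failures, identical components within each subsystem, and active redundancy. The abstract does not give the paper's exact model, and the function name and numbers are hypothetical.

```python
from math import prod
from typing import Sequence


def series_parallel_reliability(
    component_reliability: Sequence[float],
    redundancy: Sequence[int],
) -> float:
    """Reliability of subsystems in series, each with n_i identical parallel components.

    A subsystem fails only if all of its n_i components fail, so its
    reliability is 1 - (1 - r_i) ** n_i; the series system needs every
    subsystem to work, so the subsystem reliabilities are multiplied.
    """
    if len(component_reliability) != len(redundancy):
        raise ValueError("one redundancy level is required per subsystem")
    return prod(
        1.0 - (1.0 - r) ** n
        for r, n in zip(component_reliability, redundancy)
    )


if __name__ == "__main__":
    # Hypothetical two-subsystem example, not data from the paper.
    print(series_parallel_reliability([0.5, 0.5], [1, 2]))  # 0.5 * 0.75 = 0.375
```

A search method such as a genetic algorithm would vary the redundancy levels and keep only candidates that also satisfy the cost and weight limits.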
Downloads: 1156
1067 Performance Improvement of Moving Object Recognition and Tracking Algorithm using Parallel Processing of SURF and Optical Flow
Authors: Jungho Choi, Youngwan Cho
Abstract:
The paper proposes a way of parallel processing of SURF and Optical Flow for moving object recognition and tracking. Object recognition and tracking is one of the most important tasks in computer vision; however, it requires many operations, which slows processing so that real-time object recognition and tracking cannot be achieved. The proposed method uses SURF, a typical feature extraction method, and Optical Flow for moving objects to reduce this disadvantage and achieve real-time moving object recognition and tracking, together with parallel processing techniques for speed improvement. First, an image from the DB and an image acquired through the camera are analysed using SURF and compared to recognize the same object, and then an ROI (Region of Interest) is set for tracking the movement of feature points using Optical Flow. Second, multi-threading is used to improve processing speed and recognition through parallel processing. Finally, performance is evaluated and the efficiency of the algorithm is verified through experiments.
Keywords: moving object recognition, moving object tracking, SURF, Optical Flow, Multi-Thread.
Downloads: 2645
1066 A Parallel Algorithm for 2-D Cylindrical Geometry Transport Equation with Interface Corrections
Authors: Wei Jun-xia, Yuan Guang-wei, Yang Shu-lin, Shen Wei-dong
Abstract:
In order to make the conventional implicit algorithm applicable on large-scale parallel computers, an interface prediction and correction scheme for the discontinuous finite element method is presented to solve time-dependent neutron transport equations under 2-D cylindrical geometry. Domain decomposition is adopted in the computational domain. The numerical experiments show that our parallel algorithm with explicit prediction and implicit correction has good precision, parallelism and simplicity. In particular, it can reach perfect speedup even on hundreds of processors for large-scale problems.
Keywords: Transport Equation, Discontinuous Finite Element, Domain Decomposition, Interface Prediction And Correction
Downloads: 1665
1065 Modified Techniques for Distribution System Reliability Improvement by Parallel Operation of Transformers
Authors: Ohn Zin Lin, Okka, Cho Cho Myint
Abstract:
It is important to consider the effects of transformers on the distribution system because they have the highest impact on system reliability. It is generally said that parallel operation of transformers (POT) can improve the system reliability. However, the estimation approach can also be considered for accuracy. In this paper, we propose a three-state components model and equations to determine the reliability improvement by POT, and by cooperation of POT and distributed generation (DG). Based on the proposed model and techniques, the effect of POT is analyzed in four different tests with the consideration of a conventional distribution system, a distribution automation system (DAS) and DG. According to the results, the reliability is greatly improved by cooperation of POT, DAS and DG. The proposed model and methods are applicable not only to developing countries which have conventional distribution systems but also to developed countries in which DAS has already been installed.
Keywords: Distribution system, reliability, dispersed generator, energy not supply, transformer parallel operation.
Downloads: 704
1064 Edit Distance Algorithm to Increase Storage Efficiency of Javanese Corpora
Authors: Aji P. Wibawa, Andrew Nafalski, Neil Murray, Wayan F. Mahmudy
Abstract:
Since the one-to-one word translator does not have the facility to translate pragmatic aspects of Javanese, the parallel text alignment model described uses a phrase pair combination. The algorithm aligns the parallel text automatically from the beginning to the end of each sentence. Even though the results of the phrase pair combination outperform the previous algorithm, it is still inefficient. Recording all possible combinations consumes more space in the database and is time-consuming. The original algorithm is modified by applying the edit distance coefficient to improve the data-storage efficiency. As a result, the data-storage consumption is reduced by 90%, as is its learning period (42 s).
Keywords: edit distance coefficient, Javanese, parallel text alignment, phrase pair combination
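The abstract does not define its edit distance coefficient. As background only, the sketch below computes the standard Levenshtein edit distance between two sequences, which is the usual basis for such measures; the function name and the example words are assumptions, not the paper's method.

```python
from typing import Sequence


def levenshtein(source: Sequence, target: Sequence) -> int:
    """Minimum number of insertions, deletions and substitutions
    needed to turn `source` into `target` (standard Levenshtein distance)."""
    # previous[j] holds the distance between the processed prefix of
    # `source` and the first j items of `target`.
    previous = list(range(len(target) + 1))
    for i, s_item in enumerate(source, start=1):
        current = [i]
        for j, t_item in enumerate(target, start=1):
            substitution = previous[j - 1] + (s_item != t_item)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))  # 3
```

Because the function accepts any sequence, it can compare strings character by character or phrases token by token, for example `levenshtein("a b c".split(), "a c".split())`.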
Downloads: 1728
1063 Some Characteristics of Systolic Arrays
Authors: Halil Snopce, Ilir Spahiu
Abstract:
This paper investigates a possible optimization of some linear algebra problems which can be solved by parallel processing using special arrays called systolic arrays. Some special types of transformations are used for designing these arrays. We show the characteristics of these arrays. The main focus is on discussing the advantages of these arrays in the parallel computation of the matrix product, with a special approach to designing a systolic array for matrix multiplication. Multiplication of large matrices requires a lot of computational time and its complexity is O(n^3). Many algorithms (both sequential and parallel) have been developed with the purpose of minimizing the calculation time. Systolic arrays are well suited for this purpose. In this paper we show that using an appropriate transformation leads to more optimal arrays for calculations of this type.
Keywords: Data dependences, matrix multiplication, systolic array, transformation matrix.
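The cubic cost mentioned in this abstract can be seen in the plain sequential algorithm sketched below, which also counts its multiply-add steps. This is the baseline that systolic designs aim to parallelize, not a systolic array itself; the function name and the operation counter are illustrative assumptions.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def matmul_count(a: Matrix, b: Matrix) -> Tuple[Matrix, int]:
    """Multiply a (n x m) by b (m x p) with three nested loops.

    Returns the product and the number of multiply-add steps, which is
    n * m * p and therefore n ** 3 for square matrices.
    """
    n, m, p = len(a), len(b), len(b[0])
    if any(len(row) != m for row in a):
        raise ValueError("inner dimensions do not match")
    product = [[0] * p for _ in range(n)]
    steps = 0
    for i in range(n):
        for j in range(p):
            for k in range(m):
                product[i][j] += a[i][k] * b[k][j]
                steps += 1
    return product, steps


if __name__ == "__main__":
    c, steps = matmul_count([[1, 2], [3, 4]], [[5, 6], [7, 8]])
    print(c, steps)  # [[19, 22], [43, 50]] 8
```

Each of the multiply-add steps for a given output element depends only on one row of the first matrix and one column of the second, which is the regular data dependence that array designs of this kind exploit.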
Downloads: 1522
1062 Detection and Classification of Faults on Parallel Transmission Lines Using Wavelet Transform and Neural Network
Authors: V.S.Kale, S.R.Bhide, P.P.Bedekar, G.V.K.Mohan
Abstract:
The protection of parallel transmission lines has been a challenging task due to mutual coupling between the adjacent circuits of the line. This paper presents a novel scheme for detection and classification of faults on parallel transmission lines. The proposed approach uses a combination of wavelet transform and neural network to solve the problem. While the wavelet transform is a powerful mathematical tool which can be employed as a fast and very effective means of analyzing power system transient signals, an artificial neural network has the ability to classify non-linear relationships between measured signals by identifying different patterns of the associated signals. The proposed algorithm consists of time-frequency analysis of fault-generated transients using the wavelet transform, followed by pattern recognition using an artificial neural network to identify the type of the fault. MATLAB/Simulink is used to generate fault signals and verify the correctness of the algorithm. The adaptive discrimination scheme is tested by simulating different types of fault and varying fault resistance, fault location and fault inception time on a given power system model. The simulation results show that the proposed scheme for fault diagnosis is able to classify all the faults on the parallel transmission line rapidly and correctly.
Keywords: Artificial neural network, fault detection and classification, parallel transmission lines, wavelet transform.
Downloads 3012
1061 On the Efficient Implementation of a Serial and Parallel Decomposition Algorithm for Fast Support Vector Machine Training Including a Multi-Parameter Kernel
Authors: Tatjana Eitrich, Bruno Lang
Abstract:
This work deals with aspects of support vector machine learning for large-scale data mining tasks. Based on a decomposition algorithm for support vector machine training that can be run in serial as well as in shared memory parallel mode, we introduce a transformation of the training data that allows an expensive generalized kernel to be used without additional cost. We present experiments for the Gaussian kernel, but other kernel functions can be used as well. To further speed up the decomposition algorithm, we analyze the critical problem of working set selection for large training data sets. In addition, we analyze the influence of the working set size on the scalability of the parallel decomposition scheme. Our tests and conclusions led to several modifications of the algorithm and to improved overall support vector machine learning performance. Our method allows extensive parameter search methods to be used to optimize classification accuracy.
Keywords: Support Vector Machine Training, Multi-Parameter Kernels, Shared Memory Parallel Computing, Large Data
Downloads 1443
1060 Comparative Evaluation of Adaptive and Conventional Distance Relay for Parallel Transmission Line with Mutual Coupling
Authors: S. G. Srivani, Chandrasekhar Reddy Atla, K. P. Vittal
Abstract:
This paper presents the development of an adaptive distance relay for protection of a parallel transmission line with mutual coupling. The proposed adaptive relay automatically adjusts its operation based on data acquired from the distance relay of the adjacent line and on the status of the adjacent line reported by the line circuit breaker IED (Intelligent Electronic Device). The zero sequence current of the adjacent parallel transmission line is used to compute the zero sequence current ratio, and the mutual coupling effect is fully compensated. The relay adapts to changing circumstances, such as failure of communication from other relays and non-availability of the adjacent transmission line. The performance of the proposed adaptive relay is tested using steady state and dynamic test procedures. The fault transients are obtained by simulating a realistic parallel transmission line system with mutual coupling effect in PSCAD. The evaluation test results show the efficacy of the adaptive distance relay over the conventional distance relay.
Keywords: Adaptive relaying, distance measurement, mutual coupling, quadrilateral trip characteristic, zones of protection.
Downloads 3145
1059 Design and Implementation of a Software Platform Based on Artificial Intelligence for Product Recommendation
Authors: G. Settanni, A. Panarese, R. Vaira, A. Galiano
Abstract:
Nowadays, artificial intelligence is used successfully in the field of e-commerce for its ability to learn from large amounts of data. In this research study, a prototype software platform was designed and implemented to suggest to users the products most suitable for their needs. The platform includes a recommender system based on artificial intelligence algorithms that provide suggestions and decision support to the customer. Specifically, support vector machine algorithms have been implemented and combined with natural language processing techniques that allow users to interact with the system, express their requests and receive suggestions. Interested users can access the web platform on the internet using a computer, tablet or mobile phone, register, provide the necessary information and view the products that the system deems most appropriate for them. The platform also integrates a dashboard that makes the platform's various functions available in an intuitive and simple way. In addition, Long Short-Term Memory algorithms have been implemented and trained on historical data in order to predict customer scores for the different items. Items with the highest scores are recommended to customers.
Keywords: Deep Learning, Long Short-Term Memory, Machine Learning, Recommender Systems, Support Vector Machine.
Downloads 328
1058 Parallel Explicit Group Domain Decomposition Methods for the Telegraph Equation
Authors: Kew Lee Ming, Norhashidah Hj. Mohd. Ali
Abstract:
In a previous work, we presented the numerical solution of the two-dimensional second-order telegraph partial differential equation discretized by the centred and rotated five-point finite difference discretizations, namely the explicit group (EG) and explicit decoupled group (EDG) iterative methods, respectively. In this paper, we apply a domain decomposition algorithm to these group schemes to divide the tasks involved in solving the same equation. The objective of this study is to describe the development of the parallel group iterative schemes under the OpenMP programming environment as a way to reduce the computational cost of the solution process using multicore technologies. A detailed performance analysis of the parallel implementations of the point and group iterative schemes is reported and discussed.
Keywords: Telegraph equation, explicit group iterative scheme, domain decomposition algorithm, parallelization.
Downloads 1526
1057 Low Power and Less Area Architecture for Integer Motion Estimation
Authors: C Hisham, K Komal, Amit K Mishra
Abstract:
The full search block matching algorithm is widely used for hardware implementation of motion estimators in video compression algorithms. In this paper we propose a new architecture that consists of a 2D parallel processing unit and a 1D unit working in parallel. The proposed architecture reduces both data access power and computational power, which are the main causes of power consumption in integer motion estimation. It also completes the operations in nearly the same number of clock cycles as a 2D systolic array architecture. In this work the sum of absolute differences (SAD), the most repeated operation in block matching, is calculated in two steps. The first step is to calculate the SAD for alternate rows with the 2D parallel unit. If the SAD calculated by the parallel unit is less than the stored minimum SAD, the SAD of the remaining rows is calculated by the 1D unit. Early termination, which stops avoidable computations, is achieved with the alternate rows method proposed in this paper and by finding a low initial SAD value based on motion vector prediction. Data reuse is applied to the reference blocks in the same search area, which significantly reduces memory access.
Keywords: Sum of absolute difference, high speed DSP.
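The two-step SAD calculation described in the abstract above can be sketched in software as follows. This is a minimal illustration under stated assumptions, not the authors' hardware architecture: the function names (`sad_rows`, `two_step_sad`, `best_match`) are hypothetical, blocks are plain lists of pixel rows, the "alternate rows" are taken to be the even-indexed rows, and the initial minimum is infinity rather than a value derived from motion vector prediction.

```python
def sad_rows(cur, ref, rows):
    """Sum of absolute differences over the selected rows of two blocks."""
    return sum(abs(c - r) for i in rows for c, r in zip(cur[i], ref[i]))


def two_step_sad(cur, ref, min_sad):
    """Return the full SAD, or None if the candidate is rejected early.

    Step 1 (the 2D unit in the abstract): SAD of alternate rows.
    Step 2 (the 1D unit): remaining rows, only if step 1 is below min_sad.
    """
    n = len(cur)
    partial = sad_rows(cur, ref, range(0, n, 2))  # assumed: even rows first
    if partial >= min_sad:
        return None  # early termination: cannot beat the stored minimum
    return partial + sad_rows(cur, ref, range(1, n, 2))


def best_match(cur, candidates):
    """Full search over candidate reference blocks with early termination."""
    best_index, min_sad = None, float("inf")  # assumed initial minimum
    for index, ref in enumerate(candidates):
        total = two_step_sad(cur, ref, min_sad)
        if total is not None and total < min_sad:
            best_index, min_sad = index, total
    return best_index, min_sad


if __name__ == "__main__":
    cur = [[10, 10], [20, 20], [30, 30], [40, 40]]
    a = [[11, 10], [20, 22], [30, 30], [40, 41]]
    b = [[15, 15], [20, 20], [30, 30], [40, 40]]
    c = [[10, 10], [20, 21], [30, 30], [40, 40]]
    print(best_match(cur, [a, b, c]))
```

Because absolute differences are non-negative, a partial SAD that already reaches the stored minimum can never yield a better full SAD, which is why skipping the remaining rows is safe.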
Downloads 1493