Search results for: Parallel%20Computing
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 571

Search results for: Parallel%20Computing

511 Minimizing Makespan Subject to Budget Limitation in Parallel Flow Shop

Authors: Amin Sahraeian

Abstract:

One of the criteria in production scheduling is Make Span, minimizing this criteria causes more efficiently use of the resources specially machinery and manpower. By assigning some budget to some of the operations the operation time of these activities reduces and affects the total completion time of all the operations (Make Span). In this paper this issue is practiced in parallel flow shops. At first we convert parallel flow shop to a network model and by using a linear programming approach it is identified in order to minimize make span (the completion time of the network) which activities (operations) are better to absorb the predetermined and limited budget. Minimizing the total completion time of all the activities in the network is equivalent to minimizing make span in production scheduling.

Keywords: parallel flow shop, make span, linear programming, budget

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1778
510 Design and Implementation of Shared Memory based Parallel File System Logging Method for High Performance Computing

Authors: Hyeyoung Cho, Sungho Kim, SangDong Lee

Abstract:

I/O workload is a critical and important factor to analyze I/O pattern and file system performance. However tracing I/O operations on the fly distributed parallel file system is non-trivial due to collection overhead and a large volume of data. In this paper, we design and implement a parallel file system logging method for high performance computing using shared memory-based multi-layer scheme. It minimizes the overhead with reduced logging operation response time and provides efficient post-processing scheme through shared memory. Separated logging server can collect sequential logs from multiple clients in a cluster through packet communication. Implementation and evaluation result shows low overhead and high scalability of this architecture for high performance parallel logging analysis.

Keywords: I/O workload, PVFS, I/O Trace.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1559
509 Novel Ridge Orientation Based Approach for Fingerprint Identification Using Co-Occurrence Matrix

Authors: Mehran Yazdi, Zahra Adelpour, Batoul Bahraini, Yasaman Keshtkar Jahromi

Abstract:

In this paper we use the property of co-occurrence matrix in finding parallel lines in binary pictures for fingerprint identification. In our proposed algorithm, we reduce the noise by filtering the fingerprint images and then transfer the fingerprint images to binary images using a proper threshold. Next, we divide the binary images into some regions having parallel lines in the same direction. The lines in each region have a specific angle that can be used for comparison. This method is simple, performs the comparison step quickly and has a good resistance in the presence of the noise.

Keywords: Parallel lines detection, co-occurrence matrix, fingerprint identification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1356
508 Parallel Vector Processing Using Multi Level Orbital DATA

Authors: Nagi Mekhiel

Abstract:

Many applications use vector operations by applying single instruction to multiple data that map to different locations in conventional memory. Transferring data from memory is limited by access latency and bandwidth affecting the performance gain of vector processing. We present a memory system that makes all of its content available to processors in time so that processors need not to access the memory, we force each location to be available to all processors at a specific time. The data move in different orbits to become available to other processors in higher orbits at different time. We use this memory to apply parallel vector operations to data streams at first orbit level. Data processed in the first level move to upper orbit one data element at a time, allowing a processor in that orbit to apply another vector operation to deal with serial code limitations inherited in all parallel applications and interleaved it with lower level vector operations.

Keywords: Memory organization, parallel processors, serial code, vector processing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1061
507 Analyzing the Factors that Cause Parallel Performance Degradation in Parallel Graph-Based Computations Using Graph500

Authors: Mustafa Elfituri, Jonathan Cook

Abstract:

Recently, graph-based computations have become more important in large-scale scientific computing as they can provide a methodology to model many types of relations between independent objects. They are being actively used in fields as varied as biology, social networks, cybersecurity, and computer networks. At the same time, graph problems have some properties such as irregularity and poor locality that make their performance different than regular applications performance. Therefore, parallelizing graph algorithms is a hard and challenging task. Initial evidence is that standard computer architectures do not perform very well on graph algorithms. Little is known exactly what causes this. The Graph500 benchmark is a representative application for parallel graph-based computations, which have highly irregular data access and are driven more by traversing connected data than by computation. In this paper, we present results from analyzing the performance of various example implementations of Graph500, including a shared memory (OpenMP) version, a distributed (MPI) version, and a hybrid version. We measured and analyzed all the factors that affect its performance in order to identify possible changes that would improve its performance. Results are discussed in relation to what factors contribute to performance degradation.

Keywords: Graph computation, Graph500 benchmark, parallel architectures, parallel programming, workload characterization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 546
506 A Survey on Performance Tools for OpenMP

Authors: Mubarak S. Mohsen, Rosni Abdullah, Yong M. Teo

Abstract:

Advances in processors architecture, such as multicore, increase the size of complexity of parallel computer systems. With multi-core architecture there are different parallel languages that can be used to run parallel programs. One of these languages is OpenMP which embedded in C/Cµ or FORTRAN. Because of this new architecture and the complexity, it is very important to evaluate the performance of OpenMP constructs, kernels, and application program on multi-core systems. Performance is the activity of collecting the information about the execution characteristics of a program. Performance tools consists of at least three interfacing software layers, including instrumentation, measurement, and analysis. The instrumentation layer defines the measured performance events. The measurement layer determines what performance event is actually captured and how it is measured by the tool. The analysis layer processes the performance data and summarizes it into a form that can be displayed in performance tools. In this paper, a number of OpenMP performance tools are surveyed, explaining how each is used to collect, analyse, and display data collection.

Keywords: Parallel performance tools, OpenMP, multi-core.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1921
505 Isotropic Stress Distribution in Cu/(001) Fe Two Sheets

Authors: A. Derardja, L. Baroura, M. Brioua

Abstract:

The nanotechnology based on epitaxial systems includes single or arranged misfit dislocations. In general, whatever is the type of dislocation or the geometry of the array formed by the dislocations; it is important for experimental studies to know exactly the stress distribution for which there is no analytical expression [1, 2]. This work, using a numerical analysis, deals with relaxation of epitaxial layers having at their interface a periodic network of edge misfit dislocations. The stress distribution is estimated by using isotropic elasticity. The results show that the thickness of the two sheets is a crucial parameter in the stress distributions and then in the profile of the two sheets. A comparative study between the case of single dislocation and the case of parallel network shows that the layers relaxed better when the interface is covered by a parallel arrangement of misfit. Consequently, a single dislocation at the interface produces an important stress field which can be reduced by inserting a parallel network of dislocations with suitable periodicity.

Keywords: Parallel array of misfit, interface, isotropic elasticity, single crystalline substrates, coherent interface

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1569
504 On Fault Diagnosis of Asynchronous Sequential Machines with Parallel Composition

Authors: Jung-Min Yang

Abstract:

Fault diagnosis of composite asynchronous sequential machines with parallel composition is addressed in this paper. An adversarial input can infiltrate one of two submachines comprising the composite asynchronous machine, causing an unauthorized state transition. The objective is to characterize the condition under which the controller can diagnose any fault occurrence. Two control configurations, state feedback and output feedback, are considered in this paper. In the case of output feedback, the exact estimation of the state is impossible since the current state is inaccessible and the output feedback is given as the form of burst. A simple example is provided to demonstrate the proposed methodology.

Keywords: Asynchronous sequential machines, parallel composition, fault diagnosis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 969
503 Conditions for Fault Recovery of Interconnected Asynchronous Sequential Machines with State Feedback

Authors: Jung–Min Yang

Abstract:

In this paper, fault recovery for parallel interconnected asynchronous sequential machines is studied. An adversarial input can infiltrate into one of two submachines comprising parallel composition of the considered asynchronous sequential machine, causing an unauthorized state transition. The control objective is to elucidate the condition for the existence of a corrective controller that makes the closed-loop system immune against any occurrence of adversarial inputs. In particular, an efficient existence condition is presented that does not need the complete modeling of the interconnected asynchronous sequential machine.

Keywords: Asynchronous sequential machines, parallel composition, corrective control, fault tolerance.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 838
502 Hybrid Prefix Adder Architecture for Minimizing the Power Delay Product

Authors: P.Ramanathan, P.T.Vanathi

Abstract:

Parallel Prefix addition is a technique for improving the speed of binary addition. Due to continuing integrating intensity and the growing needs of portable devices, low-power and highperformance designs are of prime importance. The classical parallel prefix adder structures presented in the literature over the years optimize for logic depth, area, fan-out and interconnect count of logic circuits. In this paper, a new architecture for performing 8-bit, 16-bit and 32-bit Parallel Prefix addition is proposed. The proposed prefix adder structures is compared with several classical adders of same bit width in terms of power, delay and number of computational nodes. The results reveal that the proposed structures have the least power delay product when compared with its peer existing Prefix adder structures. Tanner EDA tool was used for simulating the adder designs in the TSMC 180 nm and TSMC 130 nm technologies.

Keywords: Parallel Prefix Adder (PPA), Dot operator, Semi-Dotoperator, Complementary Metal Oxide Semiconductor (CMOS), Odd-dot operator, Even-dot operator, Odd-semi-dot operator andEven-semi-dot operator.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1724
501 Some Computational Results on MPI Parallel Implementation of Dense Simplex Method

Authors: El-Said Badr, Mahmoud Moussa, Konstantinos Paparrizos, Nikolaos Samaras, Angelo Sifaleras

Abstract:

There are two major variants of the Simplex Algorithm: the revised method and the standard, or tableau method. Today, all serious implementations are based on the revised method because it is more efficient for sparse linear programming problems. Moreover, there are a number of applications that lead to dense linear problems so our aim in this paper is to present some computational results on parallel implementation of dense Simplex Method. Our implementation is implemented on a SMP cluster using C programming language and the Message Passing Interface MPI. Preliminary computational results on randomly generated dense linear programs support our results.

Keywords: Linear Programming, MPI, Parallel Implementation, Simplex Algorithm.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2048
500 Using the PGAS Programming Paradigm for Biological Sequence Alignment on a Chip Multi-Threading Architecture

Authors: M. Bakhouya, S. A. Bahra, T. El-Ghazawi

Abstract:

The Partitioned Global Address Space (PGAS) programming paradigm offers ease-of-use in expressing parallelism through a global shared address space while emphasizing performance by providing locality awareness through the partitioning of this address space. Therefore, the interest in PGAS programming languages is growing and many new languages have emerged and are becoming ubiquitously available on nearly all modern parallel architectures. Recently, new parallel machines with multiple cores are designed for targeting high performance applications. Most of the efforts have gone into benchmarking but there are a few examples of real high performance applications running on multicore machines. In this paper, we present and evaluate a parallelization technique for implementing a local DNA sequence alignment algorithm using a PGAS based language, UPC (Unified Parallel C) on a chip multithreading architecture, the UltraSPARC T1.

Keywords: Partitioned Global Address Space, Unified Parallel C, Multicore machines, Multi-threading Architecture, Sequence alignment.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1389
499 Fault Diagnosis of Nonlinear Systems Using Dynamic Neural Networks

Authors: E. Sobhani-Tehrani, K. Khorasani, N. Meskin

Abstract:

This paper presents a novel integrated hybrid approach for fault diagnosis (FD) of nonlinear systems. Unlike most FD techniques, the proposed solution simultaneously accomplishes fault detection, isolation, and identification (FDII) within a unified diagnostic module. At the core of this solution is a bank of adaptive neural parameter estimators (NPE) associated with a set of singleparameter fault models. The NPEs continuously estimate unknown fault parameters (FP) that are indicators of faults in the system. Two NPE structures including series-parallel and parallel are developed with their exclusive set of desirable attributes. The parallel scheme is extremely robust to measurement noise and possesses a simpler, yet more solid, fault isolation logic. On the contrary, the series-parallel scheme displays short FD delays and is robust to closed-loop system transients due to changes in control commands. Finally, a fault tolerant observer (FTO) is designed to extend the capability of the NPEs to systems with partial-state measurement.

Keywords: Hybrid fault diagnosis, Dynamic neural networks, Nonlinear systems.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2220
498 Parallel Particle Swarm Optimization Optimized LDI Controller with Lyapunov Stability Criterion for Nonlinear Structural Systems

Authors: P.-W. Tsai, W.-L. Hong, C.-W. Chen, C.-Y. Chen

Abstract:

In this paper, we present a neural-network (NN) based approach to represent a nonlinear Tagagi-Sugeno (T-S) system. A linear differential inclusion (LDI) state-space representation is utilized to deal with the NN models. Taking advantage of the LDI representation, the stability conditions and controller design are derived for a class of nonlinear structural systems. Moreover, the concept of utilizing the Parallel Particle Swarm Optimization (PPSO) algorithm to solve the common P matrix under the stability criteria is given in this paper.

Keywords: Lyapunov Stability, Parallel Particle Swarm Optimization, Linear Differential Inclusion, Artificial Intelligence.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1864
497 Simulation Modeling of Manufacturing Systems for the Serial Route and the Parallel One

Authors: Tadeusz Witkowski, Paweł Antczak, Arkadiusz Antczak

Abstract:

In the paper we discuss the influence of the route flexibility degree, the open rate of operations and the production type coefficient on makespan. The flexible job-open shop scheduling problem FJOSP (an extension of the classical job shop scheduling) is analyzed. For the analysis of the production process we used a hybrid heuristic of the GRASP (greedy randomized adaptive search procedure) with simulated annealing algorithm. Experiments with different levels of factors have been considered and compared. The GRASP+SA algorithm has been tested and illustrated with results for the serial route and the parallel one.

Keywords: Makespan, open shop, route flexibility, serial and parallel route, simulation modeling, type of production.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1814
496 Performance Evaluation of Parallel Surface Modeling and Generation on Actual and Virtual Multicore Systems

Authors: Nyeng P. Gyang

Abstract:

Even though past, current and future trends suggest that multicore and cloud computing systems are increasingly prevalent/ubiquitous, this class of parallel systems is nonetheless underutilized, in general, and barely used for research on employing parallel Delaunay triangulation for parallel surface modeling and generation, in particular. The performances, of actual/physical and virtual/cloud multicore systems/machines, at executing various algorithms, which implement various parallelization strategies of the incremental insertion technique of the Delaunay triangulation algorithm, were evaluated. T-tests were run on the data collected, in order to determine whether various performance metrics differences (including execution time, speedup and efficiency) were statistically significant. Results show that the actual machine is approximately twice faster than the virtual machine at executing the same programs for the various parallelization strategies. Results, which furnish the scalability behaviors of the various parallelization strategies, also show that some of the differences between the performances of these systems, during different runs of the algorithms on the systems, were statistically significant. A few pseudo superlinear speedup results, which were computed from the raw data collected, are not true superlinear speedup values. These pseudo superlinear speedup values, which arise as a result of one way of computing speedups, disappear and give way to asymmetric speedups, which are the accurate kind of speedups that occur in the experiments performed.

Keywords: Cloud computing systems, multicore systems, parallel delaunay triangulation, parallel surface modeling and generation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 878
495 Series-Parallel Systems Reliability Optimization Using Genetic Algorithm and Statistical Analysis

Authors: Essa Abrahim Abdulgader Saleem, Thien-My Dao

Abstract:

The main objective of this paper is to optimize series-parallel system reliability using Genetic Algorithm (GA) and statistical analysis; considering system reliability constraints which involve the redundant numbers of selected components, total cost, and total weight. To perform this work, firstly the mathematical model which maximizes system reliability subject to maximum system cost and maximum system weight constraints is presented; secondly, a statistical analysis is used to optimize GA parameters, and thirdly GA is used to optimize series-parallel systems reliability. The objective is to determine the strategy choosing the redundancy level for each subsystem to maximize the overall system reliability subject to total cost and total weight constraints. Finally, the series-parallel system case study reliability optimization results are showed, and comparisons with the other previous results are presented to demonstrate the performance of our GA.

Keywords: Genetic algorithm, optimization, reliability, statistical analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1155
494 Performance Improvement of Moving Object Recognition and Tracking Algorithm using Parallel Processing of SURF and Optical Flow

Authors: Jungho Choi, Youngwan Cho

Abstract:

The paper proposes a way of parallel processing of SURF and Optical Flow for moving object recognition and tracking. The object recognition and tracking is one of the most important task in computer vision, however disadvantage are many operations cause processing speed slower so that it can-t do real-time object recognition and tracking. The proposed method uses a typical way of feature extraction SURF and moving object Optical Flow for reduce disadvantage and real-time moving object recognition and tracking, and parallel processing techniques for speed improvement. First analyse that an image from DB and acquired through the camera using SURF for compared to the same object recognition then set ROI (Region of Interest) for tracking movement of feature points using Optical Flow. Secondly, using Multi-Thread is for improved processing speed and recognition by parallel processing. Finally, performance is evaluated and verified efficiency of algorithm throughout the experiment.

Keywords: moving object recognition, moving object tracking, SURF, Optical Flow, Multi-Thread.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2643
493 A Parallel Algorithm for 2-D Cylindrical Geometry Transport Equation with Interface Corrections

Authors: Wei Jun-xia, Yuan Guang-wei, Yang Shu-lin, Shen Wei-dong

Abstract:

In order to make conventional implicit algorithm to be applicable in large scale parallel computers , an interface prediction and correction of discontinuous finite element method is presented to solve time-dependent neutron transport equations under 2-D cylindrical geometry. Domain decomposition is adopted in the computational domain.The numerical experiments show that our parallel algorithm with explicit prediction and implicit correction has good precision, parallelism and simplicity. Especially, it can reach perfect speedup even on hundreds of processors for large-scale problems.

Keywords: Transport Equation, Discontinuous Finite Element, Domain Decomposition, Interface Prediction And Correction

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1664
492 Modified Techniques for Distribution System Reliability Improvement by Parallel Operation of Transformers

Authors: Ohn Zin Lin, Okka, Cho Cho Myint

Abstract:

It is important to consider the effects of transformers on distribution system because they have the highest impact on system reliability. It is generally said that parallel operation of transformers (POT) can improve the system reliability. However, the estimation approach can be also considered for accuracy. In this paper, we propose a three-state components model and equations to determine the reliability improvement by POT, and cooperation of POT and distributed generation (DG). Based on the proposed model and techniques, the effect of POT is analyzed in four different tests with the consideration of conventional distribution system, distribution automation system (DAS) and DG. According to the results, the reliability is greatly improved by cooperation of POT, DAS and DG. The proposed model and methods are applicable to not only developing countries which have conventional distribution system but also developed countries in which DAS has already installed.

Keywords: Distribution system, reliability, dispersed generator, energy not supply, transformer parallel operation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 701
491 Edit Distance Algorithm to Increase Storage Efficiency of Javanese Corpora

Authors: Aji P. Wibawa, Andrew Nafalski, Neil Murray, Wayan F. Mahmudy

Abstract:

Since the one-to-one word translator does not have the facility to translate pragmatic aspects of Javanese, the parallel text alignment model described uses a phrase pair combination. The algorithm aligns the parallel text automatically from the beginning to the end of each sentence. Even though the results of the phrase pair combination outperform the previous algorithm, it is still inefficient. Recording all possible combinations consume more space in the database and time consuming. The original algorithm is modified by applying the edit distance coefficient to improve the data-storage efficiency. As a result, the data-storage consumption is 90% reduced as well as its learning period (42s).

Keywords: edit distance coefficient, Javanese, parallel text alignment, phrase pair combination

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1727
490 Some Characteristics of Systolic Arrays

Authors: Halil Snopce, Ilir Spahiu

Abstract:

In this paper is investigated a possible optimization of some linear algebra problems which can be solved by parallel processing using the special arrays called systolic arrays. In this paper are used some special types of transformations for the designing of these arrays. We show the characteristics of these arrays. The main focus is on discussing the advantages of these arrays in parallel computation of matrix product, with special approach to the designing of systolic array for matrix multiplication. Multiplication of large matrices requires a lot of computational time and its complexity is O(n3 ). There are developed many algorithms (both sequential and parallel) with the purpose of minimizing the time of calculations. Systolic arrays are good suited for this purpose. In this paper we show that using an appropriate transformation implicates in finding more optimal arrays for doing the calculations of this type.

Keywords: Data dependences, matrix multiplication, systolicarray, transformation matrix.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1520
489 Detection and Classification of Faults on Parallel Transmission Lines Using Wavelet Transform and Neural Network

Authors: V.S.Kale, S.R.Bhide, P.P.Bedekar, G.V.K.Mohan

Abstract:

The protection of parallel transmission lines has been a challenging task due to mutual coupling between the adjacent circuits of the line. This paper presents a novel scheme for detection and classification of faults on parallel transmission lines. The proposed approach uses combination of wavelet transform and neural network, to solve the problem. While wavelet transform is a powerful mathematical tool which can be employed as a fast and very effective means of analyzing power system transient signals, artificial neural network has a ability to classify non-linear relationship between measured signals by identifying different patterns of the associated signals. The proposed algorithm consists of time-frequency analysis of fault generated transients using wavelet transform, followed by pattern recognition using artificial neural network to identify the type of the fault. MATLAB/Simulink is used to generate fault signals and verify the correctness of the algorithm. The adaptive discrimination scheme is tested by simulating different types of fault and varying fault resistance, fault location and fault inception time, on a given power system model. The simulation results show that the proposed scheme for fault diagnosis is able to classify all the faults on the parallel transmission line rapidly and correctly.

Keywords: Artificial neural network, fault detection and classification, parallel transmission lines, wavelet transform.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3010
488 On the Efficient Implementation of a Serial and Parallel Decomposition Algorithm for Fast Support Vector Machine Training Including a Multi-Parameter Kernel

Authors: Tatjana Eitrich, Bruno Lang

Abstract:

This work deals with aspects of support vector machine learning for large-scale data mining tasks. Based on a decomposition algorithm for support vector machine training that can be run in serial as well as shared memory parallel mode we introduce a transformation of the training data that allows for the usage of an expensive generalized kernel without additional costs. We present experiments for the Gaussian kernel, but usage of other kernel functions is possible, too. In order to further speed up the decomposition algorithm we analyze the critical problem of working set selection for large training data sets. In addition, we analyze the influence of the working set sizes onto the scalability of the parallel decomposition scheme. Our tests and conclusions led to several modifications of the algorithm and the improvement of overall support vector machine learning performance. Our method allows for using extensive parameter search methods to optimize classification accuracy.

Keywords: Support Vector Machine Training, Multi-ParameterKernels, Shared Memory Parallel Computing, Large Data

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1441
487 Comparative Evaluation of Adaptive and Conventional Distance Relay for Parallel Transmission Line with Mutual Coupling

Authors: S.G. Srivani, Chandrasekhar Reddy Atla, K.P.Vittal

Abstract:

This paper presents the development of adaptive distance relay for protection of parallel transmission line with mutual coupling. The proposed adaptive relay, automatically adjusts its operation based on the acquisition of the data from distance relay of adjacent line and status of adjacent line from line circuit breaker IED (Intelligent Electronic Device). The zero sequence current of the adjacent parallel transmission line is used to compute zero sequence current ratio and the mutual coupling effect is fully compensated. The relay adapts to changing circumstances, like failure in communication from other relays and non - availability of adjacent transmission line. The performance of the proposed adaptive relay is tested using steady state and dynamic test procedures. The fault transients are obtained by simulating a realistic parallel transmission line system with mutual coupling effect in PSCAD. The evaluation test results show the efficacy of adaptive distance relay over the conventional distance relay.

Keywords: Adaptive relaying, distance measurement, mutualcoupling, quadrilateral trip characteristic, zones of protection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3144
486 Parallel Explicit Group Domain Decomposition Methods for the Telegraph Equation

Authors: Kew Lee Ming, Norhashidah Hj. Mohd. Ali

Abstract:

In a previous work, we presented the numerical solution of the two dimensional second order telegraph partial differential equation discretized by the centred and rotated five-point finite difference discretizations, namely the explicit group (EG) and explicit decoupled group (EDG) iterative methods, respectively. In this paper, we utilize a domain decomposition algorithm on these group schemes to divide the tasks involved in solving the same equation. The objective of this study is to describe the development of the parallel group iterative schemes under OpenMP programming environment as a way to reduce the computational costs of the solution processes using multicore technologies. A detailed performance analysis of the parallel implementations of points and group iterative schemes will be reported and discussed.

Keywords: Telegraph equation, explicit group iterative scheme, domain decomposition algorithm, parallelization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1524
485 Low Power and Less Area Architecture for Integer Motion Estimation

Authors: C Hisham, K Komal, Amit K Mishra

Abstract:

Full search block matching algorithm is widely used for hardware implementation of motion estimators in video compression algorithms. In this paper we are proposing a new architecture, which consists of a 2D parallel processing unit and a 1D unit both working in parallel. The proposed architecture reduces both data access power and computational power which are the main causes of power consumption in integer motion estimation. It also completes the operations with nearly the same number of clock cycles as compared to a 2D systolic array architecture. In this work sum of absolute difference (SAD)-the most repeated operation in block matching, is calculated in two steps. The first step is to calculate the SAD for alternate rows by a 2D parallel unit. If the SAD calculated by the parallel unit is less than the stored minimum SAD, the SAD of the remaining rows is calculated by the 1D unit. Early termination, which stops avoidable computations has been achieved with the help of alternate rows method proposed in this paper and by finding a low initial SAD value based on motion vector prediction. Data reuse has been applied to the reference blocks in the same search area which significantly reduced the memory access.

Keywords: Sum of absolute difference, high speed DSP.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1489
484 Applying Autonomic Computing Concepts to Parallel Computing using Intelligent Agents

Authors: Blesson Varghese, Gerard T. McKee

Abstract:

The work reported in this paper is motivated by the fact that there is a need to apply autonomic computing concepts to parallel computing systems. Advancing on prior work based on intelligent cores [36], a swarm-array computing approach, this paper focuses on 'Intelligent agents' another swarm-array computing approach in which the task to be executed on a parallel computing core is considered as a swarm of autonomous agents. A task is carried to a computing core by carrier agents and is seamlessly transferred between cores in the event of a predicted failure, thereby achieving self-ware objectives of autonomic computing. The feasibility of the proposed swarm-array computing approach is validated on a multi-agent simulator.

Keywords: Autonomic computing, intelligent agents, swarm-array computing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1592
483 A TFETI Domain Decompositon Solver for Von Mises Elastoplasticity Model with Combination of Linear Isotropic-Kinematic Hardening

Authors: Martin Cermak, Stanislav Sysala

Abstract:

In this paper we present the efficient parallel implementation of elastoplastic problems based on the TFETI (Total Finite Element Tearing and Interconnecting) domain decomposition method. This approach allow us to use parallel solution and compute this nonlinear problem on the supercomputers and decrease the solution time and compute problems with millions of DOFs. In our approach we consider an associated elastoplastic model with the von Mises plastic criterion and the combination of linear isotropic-kinematic hardening law. This model is discretized by the implicit Euler method in time and by the finite element method in space. We consider the system of nonlinear equations with a strongly semismooth and strongly monotone operator. The semismooth Newton method is applied to solve this nonlinear system. Corresponding linearized problems arising in the Newton iterations are solved in parallel by the above mentioned TFETI. The implementation of this problem is realized in our in-house MatSol packages developed in MatLab.

Keywords: Isotropic-kinematic hardening, TFETI, domain decomposition, parallel solution.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1758
482 Discrete Breeding Swarm for Cost Minimization of Parallel Job Shop Scheduling Problem

Authors: Tarek Aboueldah, Hanan Farag

Abstract:

Parallel Job Shop Scheduling Problem (JSSP) is a multi-objective and multi constrains NP-optimization problem. Traditional Artificial Intelligence techniques have been widely used; however, they could be trapped into the local minimum without reaching the optimum solution. Thus, we propose a hybrid Artificial Intelligence (AI) model with Discrete Breeding Swarm (DBS) added to traditional AI to avoid this trapping. This model is applied in the cost minimization of the Car Sequencing and Operator Allocation (CSOA) problem. The practical experiment shows that our model outperforms other techniques in cost minimization.

Keywords: Parallel Job Shop Scheduling Problem, Artificial Intelligence, Discrete Breeding Swarm, Car Sequencing and Operator Allocation, cost minimization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 609