Search results for: Parallel Implementation
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 2201

2171 Some Results on Parallel Alternating Methods

Authors: Guangbin Wang, Fuping Tan

Abstract:

In this paper, we investigate two parallel alternating methods for solving the system of linear equations Ax = b and give convergence theorems for the parallel alternating methods when the coefficient matrix is a nonsingular H-matrix. Furthermore, we give one example to show our results.

Keywords: Nonsingular H-matrix, parallel alternating method, convergence.

Downloads: 1069

2170 Implementation of ADETRAN Language Using Message Passing Interface

Authors: Akiyoshi Wakatani

Abstract:

This paper describes the Message Passing Interface (MPI) implementation of the ADETRAN language and its evaluation on SX-ACE supercomputers. The ADETRAN language includes a pdo statement, which specifies the data distribution and parallel computations, and a pass statement, which specifies the redistribution of arrays. Two methods for implementing the pass statement are discussed, and a performance evaluation using the Splitting-Up CG method is presented. The effectiveness of the parallelization is evaluated, and the advantage of one-dimensional distribution is empirically confirmed by the experimental results.

Keywords: Iterative methods, array redistribution, translator, distributed memory.

Downloads: 1152

2169 A Consideration of the Achievement of Productive Level Parallel Programming Skills

Authors: Tadayoshi Horita, Masakazu Akiba, Mina Terauchi, Tsuneo Kanno

Abstract:

This paper considers the achievement of productive level parallel programming skills, based on data from graduation studies at the Polytechnic University of Japan. The data show that most students can achieve only parallel programming skills during the graduation study (about 600 to 700 hours), if the programming environment is limited to GPGPUs. However, the data also show that it is a very demanding task for a student to achieve productive level parallel programming skills during the graduation study alone. In addition, the data show that parallel programming environments for GPGPU, such as CUDA and OpenCL, may be more suitable for parallel computing education than other environments such as MPI on a cluster system and Cell.B.E. These results should be useful not only for software development, but also for hardware product development using computer technologies.

Keywords: Parallel computing, programming education, GPU, GPGPU, CUDA, OpenCL, MPI, Cell.B.E.

Downloads: 1642

2168 On the Efficient Implementation of a Serial and Parallel Decomposition Algorithm for Fast Support Vector Machine Training Including a Multi-Parameter Kernel

Authors: Tatjana Eitrich, Bruno Lang

Abstract:

This work deals with aspects of support vector machine learning for large-scale data mining tasks. Based on a decomposition algorithm for support vector machine training that can be run in serial as well as shared memory parallel mode, we introduce a transformation of the training data that allows for the usage of an expensive generalized kernel without additional costs. We present experiments for the Gaussian kernel, but usage of other kernel functions is possible, too. In order to further speed up the decomposition algorithm, we analyze the critical problem of working set selection for large training data sets. In addition, we analyze the influence of the working set sizes on the scalability of the parallel decomposition scheme. Our tests and conclusions led to several modifications of the algorithm and to improved overall support vector machine learning performance. Our method allows for using extensive parameter search methods to optimize classification accuracy.

Keywords: Support Vector Machine Training, Multi-Parameter Kernels, Shared Memory Parallel Computing, Large Data

Downloads: 1394

2167 Low Power and Less Area Architecture for Integer Motion Estimation

Authors: C Hisham, K Komal, Amit K Mishra

Abstract:

The full search block matching algorithm is widely used for hardware implementation of motion estimators in video compression algorithms. In this paper we propose a new architecture, which consists of a 2D parallel processing unit and a 1D unit, both working in parallel. The proposed architecture reduces both data access power and computational power, which are the main causes of power consumption in integer motion estimation. It also completes the operations in nearly the same number of clock cycles as a 2D systolic array architecture. In this work, the sum of absolute differences (SAD), the most repeated operation in block matching, is calculated in two steps. The first step is to calculate the SAD for alternate rows with the 2D parallel unit. If the SAD calculated by the parallel unit is less than the stored minimum SAD, the SAD of the remaining rows is calculated by the 1D unit. Early termination, which stops avoidable computations, has been achieved with the help of the alternate rows method proposed in this paper and by finding a low initial SAD value based on motion vector prediction. Data reuse has been applied to the reference blocks in the same search area, which significantly reduces memory access.
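
The two-step SAD with early termination described in this abstract can be sketched in software. This is only an illustrative model, not the authors' hardware design: the function names (`row_sad`, `two_step_sad`, `full_search`) are hypothetical, even-indexed rows stand in for the rows handled by the 2D parallel unit, and odd-indexed rows stand in for those handled by the 1D unit.

```python
from typing import List, Optional, Sequence, Tuple

Block = Sequence[Sequence[int]]


def row_sad(cur_row: Sequence[int], ref_row: Sequence[int]) -> int:
    """Sum of absolute differences for one row of pixels."""
    return sum(abs(c - r) for c, r in zip(cur_row, ref_row))


def two_step_sad(cur: Block, ref: Block, min_sad: float) -> Optional[int]:
    """Return the full SAD, or None if the candidate is rejected early.

    Step 1 sums alternate (even-indexed) rows. Because SAD terms are
    non-negative, a partial sum that already reaches min_sad can never
    yield a better match, so the remaining rows are skipped.
    """
    partial = sum(row_sad(cur[i], ref[i]) for i in range(0, len(cur), 2))
    if partial >= min_sad:
        return None  # early termination
    # Step 2: remaining (odd-indexed) rows.
    rest = sum(row_sad(cur[i], ref[i]) for i in range(1, len(cur), 2))
    return partial + rest


def full_search(cur: Block, candidates: List[Block]) -> Tuple[Optional[int], float]:
    """Return (index, SAD) of the best-matching candidate block."""
    best_idx: Optional[int] = None
    best_sad: float = float("inf")
    for idx, ref in enumerate(candidates):
        sad = two_step_sad(cur, ref, best_sad)
        if sad is not None and sad < best_sad:
            best_idx, best_sad = idx, sad
    return best_idx, best_sad
```

Starting `best_sad` from a low predicted value instead of infinity would model the motion-vector-prediction idea mentioned in the abstract; the sketch leaves that out.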

Keywords: Sum of absolute difference, high speed DSP.

Downloads: 1447

2166 Development of Heterogeneous Parallel Genetic Simulated Annealing Using Multi-Niche Crowding

Authors: Z. G. Wang, M. Rahman, Y. S. Wong, K. S. Neo

Abstract:

In this paper, a new hybrid of genetic algorithm (GA) and simulated annealing (SA), referred to as GSA, is presented. In this algorithm, SA is incorporated into GA to escape from local optima. The concept of hierarchical parallel GA is employed to parallelize GSA for the optimization of multimodal functions. In addition, multi-niche crowding is used to maintain the diversity in the population of the parallel GSA (PGSA). The performance of the proposed algorithms is evaluated against a standard set of multimodal benchmark functions. The multi-niche crowding PGSA and normal PGSA show some remarkable improvement in comparison with the conventional parallel genetic algorithm and the breeder genetic algorithm (BGA).

Keywords: Crowding, genetic algorithm, parallel genetic algorithm, simulated annealing.

Downloads: 1539

2165 Designing a Robust Controller for a 6 Linkage Robot

Authors: G. Khamooshian

Abstract:

One of the main issues in applying series and parallel mechanisms is controlling them. The control of this mechanism and similar mechanisms has long been of interest to scholars. On the other hand, modeling the behavior of the system is difficult due to the large number of its parameters, and it leads to complex equations that are difficult to solve and eventually difficult to control. In this paper, a six-linkage robot is presented that could be used in different areas such as medical robots. Using these robots requires robust control. In this paper, the system equations are first found, and then the system conversion function is written. A new controller has been designed for this robot, which could also be used in other parallel robots. Parallel robots are important in robotics because of their stability, so methods for controlling them are important, and a robust controller is especially appropriate for parallel robots.

Keywords: 3-RRS, 6 linkage, parallel robot, control.

Downloads: 628

2164 Dynamic Analysis of Offshore 2-HUS/U Parallel Platform

Authors: Xie Kefeng, Zhang He

Abstract:

For the stability and control demand of offshore small floating platform, a 2-HUS/U parallel mechanism was presented as offshore platform. Inverse kinematics was obtained by institutional constraint equation, and the dynamic model of offshore 2-HUS/U parallel platform was derived based on rigid body’s Lagrangian method. The equivalent moment of inertia, damping and driving force/torque variation of offshore 2-HUS/U parallel platform were analyzed. A numerical example shows that, for parallel platform of given motion, system’s equivalent inertia changes 1.25 times maximally. During the movement of platform, they change dramatically with the system configuration and have coupling characteristics. The maximum equivalent drive torque is 800 N. At the same time, the curve of platform’s driving force/torque is smooth and has good sine features. The control system needs to be adjusted according to kinetic equation during stability and control and it provides a basis for the optimization of control system.

Keywords: 2-HUS/U platform, Dynamics, Lagrange, Parallel platform.

Downloads: 929

2163 JConqurr - A Multi-Core Programming Toolkit for Java

Authors: G.A.C.P. Ganegoda, D.M.A. Samaranayake, L.S. Bandara, K.A.D.N.K. Wimalawarne

Abstract:

With the popularity of multi-core and many-core architectures, there is a great need for software frameworks that support parallel programming methodologies. In this paper we introduce an Eclipse toolkit, JConqurr, which is easy to use and provides robust support for flexible parallel programming. JConqurr is a multi-core and many-core programming toolkit for Java that supports common parallel programming patterns, including task, data, divide and conquer, and pipeline parallelism. The toolkit uses an annotation and a directive mechanism to convert sequential code into parallel code. In addition, we propose a novel mechanism to achieve parallelism using graphical processing units (GPU). Experiments with common parallelizable algorithms have shown that our toolkit can be easily and efficiently used to convert sequential code to parallel code and that significant performance gains can be achieved.

Keywords: Multi-core, parallel programming patterns, GPU, Java, Eclipse plugin, toolkit.

Downloads: 2067

2162 Parallel Hybrid Honeypot and IDS Architecture to Detect Network Attacks

Authors: Hafiz Gulfam Ahmad, Chuangdong Li, Zeeshan Ahmad

Abstract:

In this paper, we propose a parallel IDS and honeypot based approach to detect and analyze the unknown and known attack taxonomy, improving IDS performance and protecting the network from intruders. The main theme of our approach is to record and analyze intruder activities by using both low and high interaction honeypots. Our architecture aims to achieve the required goals by combining signature based IDS and honeypots and by generating new signatures. The paper describes the basic components, design and implementation of this approach and also demonstrates its effectiveness in reducing the probability of network attacks.

Keywords: Network security, Intrusion detection, Honeypot, Snort, Nmap.

Downloads: 2486

2161 A Parallel Approach for 3D-Variational Data Assimilation on GPUs in Ocean Circulation Models

Authors: Rossella Arcucci, Luisa D’Amore, Simone Celestino, Giuseppe Scotti, Giuliano Laccetti

Abstract:

This work is the first dowel in a rather wide research activity in collaboration with Euro Mediterranean Center for Climate Changes, aimed at introducing scalable approaches in Ocean Circulation Models. We discuss designing and implementation of a parallel algorithm for solving the Variational Data Assimilation (DA) problem on Graphics Processing Units (GPUs). The algorithm is based on the fully scalable 3DVar DA model, previously proposed by the authors, which uses a Domain Decomposition approach (we refer to this model as the DD-DA model). We proceed with an incremental porting process consisting of 3 distinct stages: requirements and source code analysis, incremental development of CUDA kernels, testing and optimization. Experiments confirm the theoretic performance analysis based on the so-called scale up factor demonstrating that the DD-DA model can be suitably mapped on GPU architectures.

Keywords: Data Assimilation, Parallel Algorithm, GPU architectures, Ocean Models.

Downloads: 1954

2160 Using Multi-Thread Technology Realize Most Short-Path Parallel Algorithm

Authors: Chang-le Lu, Yong Chen

Abstract:

The shortest-path problem is a model problem in graph theory, and it is applied in many fields. It divides into two kinds: single-source shortest paths and all-pairs shortest paths (shortest paths between all vertices). This article mainly addresses the all-pairs shortest-path problem and gives a new parallel algorithm for it based on the Dijkstra algorithm. Finally, the paper implements the parallel algorithm using C# multithreading.
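
One common way to parallelize the all-vertices shortest-path problem with Dijkstra's algorithm is to run an independent single-source search from every vertex on a pool of threads. The sketch below illustrates that idea only; the abstract does not give the details of the authors' algorithm, the paper uses C# rather than Python, and the names `dijkstra` and `all_pairs_shortest_paths` are hypothetical. It assumes non-negative edge weights and that every vertex appears as a key of the adjacency dictionary.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, Hashable

Graph = Dict[Hashable, Dict[Hashable, float]]


def dijkstra(graph: Graph, source: Hashable) -> Dict[Hashable, float]:
    """Single-source shortest distances for non-negative edge weights."""
    dist = {v: float("inf") for v in graph}
    dist[source] = 0
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def all_pairs_shortest_paths(graph: Graph, workers: int = 4) -> Dict[Hashable, Dict[Hashable, float]]:
    """Run one Dijkstra search per source vertex on a thread pool."""
    sources = list(graph)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(lambda s: dijkstra(graph, s), sources))
    return dict(zip(sources, results))
```

The searches only read the shared graph, so no locking is needed. In CPython the global interpreter lock limits the speedup of CPU-bound threads, so this shows the decomposition rather than the performance a C# implementation would reach.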

Keywords: Dijkstra algorithm, parallel algorithms, multi-thread technology, most short-path, ratio.

Downloads: 2050

2159 Statistical Reliability Based Modeling of Series and Parallel Operating Systems using Extreme Value Theory

Authors: Mohamad Mahdavi, Mojtaba Mahdavi

Abstract:

This paper presents a new method for computing the reliability of a system arranged in a series or parallel model. In this method we estimate the life distribution function of the whole structure using the asymptotic Extreme Value (EV) distribution of Type I, or Gumbel theory. We use the EV distribution in minimal mode to estimate the life distribution function of a series structure and in maximal mode for a parallel system. All parameters are estimated by the method of moments. The reliability function, the failure (hazard) rate and the p-th percentile point of each function are determined. Other important indexes, such as Mean Time to Failure (MTTF) and Mean Time to Repair (MTTR), for non-repairable and renewal systems in both series and parallel structures are also computed.
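
As a rough illustration of the approach outlined in this abstract, the sketch below fits a Type I (Gumbel) distribution by the method of moments and evaluates the reliability R(t) = 1 - F(t). It relies on the standard Gumbel moment relations (variance = pi^2 * beta^2 / 6; mean = mu + gamma * beta for the maximum form and mu - gamma * beta for the minimum form). The function names and the "min"/"max" mode strings are hypothetical, and the paper's own estimators may differ.

```python
import math
from statistics import mean, stdev
from typing import Sequence, Tuple

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def fit_gumbel_moments(samples: Sequence[float], mode: str) -> Tuple[float, float]:
    """Method-of-moments estimate of (location mu, scale beta).

    mode="min" models a series system (life of the weakest component),
    mode="max" models a parallel system (life of the longest survivor).
    """
    beta = stdev(samples) * math.sqrt(6) / math.pi
    if mode == "max":
        mu = mean(samples) - EULER_GAMMA * beta
    elif mode == "min":
        mu = mean(samples) + EULER_GAMMA * beta
    else:
        raise ValueError("mode must be 'min' or 'max'")
    return mu, beta


def reliability(t: float, mu: float, beta: float, mode: str) -> float:
    """R(t) = 1 - F(t) for the fitted Gumbel distribution."""
    z = (t - mu) / beta
    if mode == "max":
        # F(t) = exp(-exp(-z))
        return 1.0 - math.exp(-math.exp(-z))
    # Minimum form: F(t) = 1 - exp(-exp(z))
    return math.exp(-math.exp(z))
```

For example, `fit_gumbel_moments(lifetimes, "min")` followed by `reliability(t, mu, beta, "min")` gives an estimated probability that a series system survives past time `t`, where `lifetimes` is a hypothetical list of observed system lifetimes.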

Keywords: Reliability, extreme value, parallel, series, life distribution

Downloads: 2047

2158 Parallel Joint Channel Coding and Cryptography

Authors: Nataša Živić, Christoph Ruland

Abstract:

Method of Parallel Joint Channel Coding and Cryptography has been analyzed and simulated in this paper. The method is an extension of Soft Input Decryption with feedback, which is used for improvement of channel decoding of secured messages. Parallel Joint Channel Coding and Cryptography results in improved coding gain of channel decoding, which achieves more than 2 dB. Such results are an implication of a combination of receiver components and their interoperability.

Keywords: Block length, Coding gain, Feedback, L-values, Parallel Joint Channel Coding and Cryptography, Soft Input Decryption.

Downloads: 1531

2157 Performance Analysis of Selective Adaptive Multiple Access Interference Cancellation for Multicarrier DS-CDMA Systems

Authors: Maged Ahmed, Ahmed El-Mahdy

Abstract:

In this paper, Selective Adaptive Parallel Interference Cancellation (SA-PIC) technique is presented for Multicarrier Direct Sequence Code Division Multiple Access (MC DS-CDMA) scheme. The motivation of using SA-PIC is that it gives high performance and at the same time, reduces the computational complexity required to perform interference cancellation. An upper bound expression of the bit error rate (BER) for the SA-PIC under Rayleigh fading channel condition is derived. Moreover, the implementation complexities for SA-PIC and Adaptive Parallel Interference Cancellation (APIC) are discussed and compared. The performance of SA-PIC is investigated analytically and validated via computer simulations.

Keywords: Adaptive interference cancellation, communication systems, multicarrier signal processing, spread spectrum

Downloads: 1805

2156 FPGA Based Parallel Architecture for the Computation of Third-Order Cross Moments

Authors: Syed Manzoor Qasim, Shuja Abbasi, Saleh Alshebeili, Bandar Almashary, Ateeq Ahmad Khan

Abstract:

Higher-order Statistics (HOS), also known as cumulants, cross moments and their frequency domain counterparts, known as poly spectra, have emerged as a powerful signal processing tool for the synthesis and analysis of signals and systems. Algorithms used for the computation of cross moments are computationally intensive and require high computational speed for real-time applications. For efficiency and high speed, it is often advantageous to realize computation-intensive algorithms in hardware. A promising solution that combines high flexibility with the speed of traditional hardware is the Field Programmable Gate Array (FPGA). In this paper, we present an FPGA-based parallel architecture for the computation of third-order cross moments. The proposed design is coded in Very High Speed Integrated Circuit (VHSIC) Hardware Description Language (VHDL) and functionally verified by implementing it on a Xilinx Spartan-3 XC3S2000FG900-4 FPGA. Implementation results are presented and show that the proposed design can operate at a maximum frequency of 86.618 MHz.

Keywords: Cross moments, Cumulants, FPGA, Hardware Implementation.

Downloads: 1688

2155 Parallel Text Processing: Alignment of Indonesian to Javanese Language

Authors: Aji P. Wibawa, Andrew Nafalski, Neil Murray, Wayan F. Mahmudy

Abstract:

Parallel text alignment is proposed as a way of aligning bahasa Indonesia to words in Javanese. Since the one-to-one word translator does not have the facility to translate pragmatic aspects of Javanese, the parallel text alignment model described uses a phrase pair combination. The algorithm aligns the parallel text automatically from the beginning to the end of each sentence. Even though the results of the phrase pair combination outperform the previous algorithm, it is still inefficient: recording all possible combinations consumes more space in the database and is time-consuming. The original algorithm is modified by applying the edit distance coefficient to improve data-storage efficiency. As a result, data-storage consumption is reduced by 90%, and the learning period is also reduced (42 s).
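
The abstract does not define its edit distance coefficient. As an assumption, the sketch below uses the classic Levenshtein distance and a normalized similarity score derived from it, which is one plausible way to filter candidate phrase pairs before storing them. The names `levenshtein` and `similarity` are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn string a into string b."""
    prev = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, ca in enumerate(a, start=1):
        cur = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            cur.append(min(prev[j] + 1,          # delete ca
                           cur[j - 1] + 1,       # insert cb
                           prev[j - 1] + cost))  # substitute or match
        prev = cur
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Normalized score in [0, 1]; 1.0 means identical strings.
    This normalization is an assumption, not the paper's definition."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest
```

A storage-saving filter could then keep only phrase pairs whose similarity exceeds a chosen threshold; the threshold value would have to be tuned on real data.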

Keywords: Parallel text alignment, phrase pair combination, edit distance coefficient, Javanese-Indonesian language.

Downloads: 2437

2154 Coupling Compensation of 6-DOF Parallel Robot Based on Screw Theory

Authors: Ming Cong, Yinghua Wu, Dong Liu, Haiying Wen, Junfa Yu

Abstract:

In order to improve control performance and eliminate steady, a coupling compensation for 6-DOF parallel robot is presented. Taking dynamic load Tank Simulator as the research object, this paper analyzes the coupling of 6-DOF parallel robot considering the degree of freedom of the 6-DOF parallel manipulator. The coupling angle and coupling velocity are derived based on inverse kinematics model. It uses the mechanism-model combined method which takes practical moving track that considering the performance of motion controller and motor as its input to make the study. Experimental results show that the coupling compensation improves motion stability as well as accuracy. Besides, it decreases the dither amplitude of dynamic load Tank Simulator.

Keywords: coupling compensation, screw theory, parallel robot, mechanism-model combined motion

Downloads: 1634

2153 Asynchronous Parallel Distributed Genetic Algorithm with Elite Migration

Authors: Kazunori Kojima, Masaaki Ishigame, Goutam Chakraborty, Hiroshi Hatsuo, Shozo Makino

Abstract:

In most popular implementations of Parallel GAs, the whole population is divided into a set of subpopulations, each subpopulation executes a GA independently, and some individuals are migrated at fixed intervals on a ring topology. In these studies, the migrations usually occur 'synchronously' among subpopulations. Therefore, CPUs are not used efficiently and the communication does not occur efficiently either. A few studies tried asynchronous migration, but it is hard to implement and setting proper parameter values is difficult. The aim of our research is to develop a migration method which is easy to implement, for which parameter values are easy to set, and which reduces communication traffic. In this paper, we propose a traffic reduction method for the Asynchronous Parallel Distributed GA by migration of elites only. This is a Server-Client model. Every client executes a GA on a subpopulation and sends elite information to the server. The server manages the elite information of each client, and the migrations occur according to the evolution of the subpopulation in a client. This facilitates the reduction in communication traffic. To evaluate our proposed model, we apply it to many function optimization problems. We confirm that our proposed method performs as well as current methods, that the communication traffic is less, and that setting the parameters is much easier.

Keywords: Parallel Distributed Genetic Algorithm (PDGA), asynchronous PDGA, Server-Client configuration, Elite Migration

Downloads: 1317

2152 Some Results on Parallel Alternating Two-stage Methods

Authors: Guangbin Wang, Xue Li

Abstract:

In this paper, we present parallel alternating two-stage methods for solving linear system Ax=b, where A is a symmetric positive definite matrix. And we give some convergence results of these methods for nonsingular linear system.

Keywords: alternating two-stage, convergence, linear system, parallel.

Downloads: 1145

2151 Implementation of Watch Dog Timer for Fault Tolerant Computing on Cluster Server

Authors: Meenakshi Bheevgade, Rajendra M. Patrikar

Abstract:

In today's new technology era, clusters have become a necessity for modern computing and data applications, since many applications take a long time (even days or months) to compute. Although parallelization speeds up computation, the time required for many applications can still be long. Thus, the reliability of the cluster becomes a very important issue, and implementing a fault tolerant mechanism becomes essential. The difficulty in designing a fault tolerant cluster system increases with the difficulties of various failures. The most important requirement is that the algorithm, which avoids a simple failure in a system, must also tolerate the more severe failures. In this paper, we implemented the theory of the watchdog timer in a parallel environment to take care of failures. Implementing a simple algorithm in our project helps us to take care of different types of failures; consequently, we found that the reliability of this cluster improves.
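
The general watchdog timer idea can be sketched as follows: a monitored task must periodically "kick" the timer, and if it fails to do so within the timeout, a recovery callback runs. This is a single-process illustration using Python's `threading.Timer`, not the authors' cluster implementation; the `Watchdog` class and its method names are hypothetical.

```python
import threading
from typing import Callable, Optional


class Watchdog:
    """Calls on_expire() if kick() is not called again within timeout seconds."""

    def __init__(self, timeout: float, on_expire: Callable[[], None]) -> None:
        self._timeout = timeout
        self._on_expire = on_expire
        self._timer: Optional[threading.Timer] = None
        self._lock = threading.Lock()

    def kick(self) -> None:
        """Start or restart the countdown; call this from the healthy task."""
        with self._lock:
            if self._timer is not None:
                self._timer.cancel()
            self._timer = threading.Timer(self._timeout, self._on_expire)
            self._timer.daemon = True  # do not block interpreter exit
            self._timer.start()

    def stop(self) -> None:
        """Cancel the countdown, for example after a clean shutdown."""
        with self._lock:
            if self._timer is not None:
                self._timer.cancel()
                self._timer = None
```

In a cluster setting, the expiry callback would typically mark a node as failed and reschedule its work; how the paper handles that step is not stated in the abstract.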

Keywords: Cluster, Fault tolerant, Grid, Grid Computing System, Meta-computing.

Downloads: 2178

2150 Processor Scheduling on Parallel Computers

Authors: Mohammad S. Laghari, Gulzar A. Khuwaja

Abstract:

Many problems in computer vision and image processing present potential for parallel implementations through one of the three major paradigms of geometric parallelism, algorithmic parallelism and processor farming. Static process scheduling techniques are used successfully to exploit geometric and algorithmic parallelism, while dynamic process scheduling is better suited to dealing with the independent processes inherent in the process farming paradigm. This paper considers the application of parallel or multi-computers to a class of problems exhibiting spatial data characteristic of the geometric paradigm. However, by using processor farming paradigm, a dynamic scheduling technique is developed to suit the MIMD structure of the multi-computers. A hybrid scheme of scheduling is also developed and compared with the other schemes. The specific problem chosen for the investigation is the Hough transform for line detection.

Keywords: Hough transforms, parallel computer, parallel paradigms, scheduling.

Downloads: 1608

2149 Improved Pattern Matching Applied to Surface Mounting Devices Components Localization on Automated Optical Inspection

Authors: Pedro M. A. Vitoriano, Tito G. Amaral

Abstract:

Automated Optical Inspection (AOI) Systems are commonly used on Printed Circuit Boards (PCB) manufacturing. The use of this technology has been proven as highly efficient for process improvements and quality achievements. The correct extraction of the component for posterior analysis is a critical step of the AOI process. Nowadays, the Pattern Matching Algorithm is commonly used, although this algorithm requires extensive calculations and is time consuming. This paper will present an improved algorithm for the component localization process, with the capability of implementation in a parallel execution system.

Keywords: AOI, automated optical inspection, SMD, surface mounting devices, pattern matching, parallel execution.

Downloads: 1024

2148 Workspace Analysis of 6–6 Cable-Suspended Parallel Robots

Authors: Arian Bahrami, Amir Teimourian

Abstract:

In this paper, the effect of the moving platform size on the workspace volume of 6–6 cable-suspended parallel robots is investigated in detail for different geometric configurations and orientations of the moving platform. The obtained hints can be used as a rule of thumb in designing this type of robot.

Keywords: Cable-suspended parallel robot, system analysis and design, workspace analysis.

Downloads: 1124

2147 Neural Networks Approaches for Computing the Forward Kinematics of a Redundant Parallel Manipulator

Authors: H. Sadjadian, H.D. Taghirad, A. Fatehi

Abstract:

In this paper, different approaches to solving the forward kinematics of a three DOF actuator redundant hydraulic parallel manipulator are presented. In contrast to serial manipulators, the forward kinematic map of parallel manipulators involves highly coupled nonlinear equations, which are almost impossible to solve analytically. The proposed methods use neural network identification with different structures to solve the problem. The accuracy of the results of each method is analyzed in detail, and their advantages and disadvantages in computing the forward kinematic map of the given mechanism are discussed. It is concluded that ANFIS presents the best performance compared to MLP, RBF and PNN networks in this particular application.

Keywords: Forward Kinematics, Neural Networks, Numerical Solution, Parallel Manipulators.

Downloads: 1884

2146 Specialization-based parallel Processing without Memo-trees

Authors: Hidemi Ogasawara, Kiyoshi Akama, Hiroshi Mabuchi

Abstract:

The purpose of this paper is to propose a framework for constructing correct parallel processing programs based on the Equivalent Transformation Framework (ETF). In the framework, a problem's domain knowledge and a query are described in definite clauses, and computation is regarded as transformation of the definite clauses. Its meaning is defined by a model of the set of definite clauses, and the transformation rules generated must preserve meaning. We have proposed a parallel processing method based on "specialization", a part of operation in the transformations, which resembles substitution in logic programming. The method requires a "Memo-tree", a history of specialization, to maintain correctness. In this paper we propose a new method for specialization-based parallel processing without a Memo-tree.

Keywords: Parallel processing, Program correctness, Equivalent transformation, Specializer generation rule

Downloads: 1280

2145 A Finite Precision Block Floating Point Treatment to Direct Form, Cascaded and Parallel FIR Digital Filters

Authors: Abhijit Mitra

Abstract:

This paper proposes an efficient finite precision block floating point (BFP) treatment of the fixed coefficient finite impulse response (FIR) digital filter. The treatment includes effective implementation of all three forms of the conventional FIR filter, namely direct form, cascaded and parallel, and a roundoff error analysis of them in the BFP format. An effective block formatting algorithm together with an adaptive scaling factor is proposed to make the realizations simpler from a hardware viewpoint. To this end, a generic relation between the tap weight vector length and the input block length is deduced. The implementation scheme also emphasises a simple block exponent update technique to prevent overflow even during the block to block transition phase. The roundoff noise is also investigated along analogous lines, taking these implementation issues into consideration. The simulation results show that the BFP roundoff errors depend on the signal level in almost the same way as floating point roundoff noise, resulting in an approximately constant signal to noise ratio over a relatively large dynamic range.
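
The basic block floating point idea, one shared exponent per block with fixed-width integer mantissas, can be sketched as below. This is a generic illustration under stated assumptions, not the paper's block formatting algorithm or its adaptive scaling factor; `to_bfp`, `from_bfp` and the default 8-bit mantissa width are hypothetical choices.

```python
import math
from typing import List, Sequence, Tuple


def to_bfp(block: Sequence[float], mantissa_bits: int = 8) -> Tuple[List[int], int]:
    """Quantize a block to signed integer mantissas with one shared exponent.

    The exponent is chosen from the largest magnitude in the block so that
    every scaled sample lies strictly inside (-1, 1).
    """
    peak = max((abs(x) for x in block), default=0.0)
    if peak == 0:
        return [0] * len(block), 0
    _, exponent = math.frexp(peak)  # peak = m * 2**exponent with 0.5 <= m < 1
    scale = 1 << (mantissa_bits - 1)
    mantissas = []
    for x in block:
        q = round(math.ldexp(x, -exponent) * scale)
        # Clamp to the signed range so rounding cannot overflow the word.
        mantissas.append(max(-scale, min(scale - 1, q)))
    return mantissas, exponent


def from_bfp(mantissas: Sequence[int], exponent: int, mantissa_bits: int = 8) -> List[float]:
    """Reconstruct floating point values from mantissas and shared exponent."""
    scale = 1 << (mantissa_bits - 1)
    return [math.ldexp(m / scale, exponent) for m in mantissas]
```

Because the quantization step is proportional to the block's peak value, the absolute roundoff error scales with the signal level, which is consistent with the roughly constant signal to noise ratio reported in the abstract.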

Keywords: Finite impulse response digital filters, Cascade structure, Parallel structure, Block floating point arithmetic, Roundoff error.

Downloads: 1603

2144 Achieving Fair Share Objectives via Goal-Oriented Parallel Computer Job Scheduling Policies

Authors: Sangsuree Vasupongayya

Abstract:

Fair share is one of the scheduling objectives supported on many production systems. However, fair share has been shown to cause performance problems for some users, especially users with difficult jobs. This work focuses on extending goal-oriented parallel computer job scheduling policies to cover the fair share objective. Goal-oriented parallel computer job scheduling policies have been shown to achieve good scheduling performance when conflicting objectives are required. Goal-oriented policies achieve such good performance by using anytime combinatorial search techniques to find a good compromise schedule within a time limit. The experimental results show that the proposed goal-oriented parallel computer job scheduling policy (namely Tradeofffs( Tw:avgX)) achieves good scheduling performance and also provides good fair share performance.

Keywords: goal-oriented parallel job scheduling policies, fair share.

Downloads: 1153

2143 Performance Analysis of the Subgroup Method for Collective I/O

Authors: Kwangho Cha, Hyeyoung Cho, Sungho Kim

Abstract:

As many scientific applications require large data processing, the importance of parallel I/O has been increasingly recognized. Collective I/O is one of the considerable features of parallel I/O and enables application programmers to easily handle their large data volume. In this paper we measured and analyzed the performance of original collective I/O and the subgroup method, the way of using collective I/O of MPI effectively. From the experimental results, we found that the subgroup method showed good performance with small data size.

Keywords: Collective I/O, MPI, parallel file system.

Downloads: 1529

2142 Kinematic Analysis of a Novel Complex DoF Parallel Manipulator

Authors: M.A. Hosseini, P. Ebrahimi Naghani

Abstract:

In this research work, a novel parallel manipulator with a high positioning and orienting rate is introduced. This mechanism has two rotational and one translational degrees of freedom. Kinematics and Jacobian analysis are investigated. Moreover, workspace analysis and optimization have been performed using the genetic algorithm toolbox in Matlab. Because the number of moving elements is reduced, much better dynamic performance is expected compared with counterpart mechanisms with the same degrees of freedom. In addition, using a couple of cylindrical and revolute joints increases the mechanism's ability to have a more extended workspace.

Keywords: Kinematics, Workspace, 3-CRS/PU, Parallel robot

Downloads: 1832