Search results for: Parallel processing
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 2053

2053 Specialization-based parallel Processing without Memo-trees

Authors: Hidemi Ogasawara, Kiyoshi Akama, Hiroshi Mabuchi

Abstract:

The purpose of this paper is to propose a framework for constructing correct parallel processing programs based on the Equivalent Transformation Framework (ETF). In the framework, a problem's domain knowledge and a query are described in definite clauses, and computation is regarded as transformation of the definite clauses. Its meaning is defined by a model of the set of definite clauses, and the transformation rules generated must preserve meaning. We have proposed a parallel processing method based on "specialization", a part of the operation in the transformations, which resembles substitution in logic programming. The method requires a "Memo-tree", a history of specialization, to maintain correctness. In this paper we propose a new method for specialization-based parallel processing without the Memo-tree.

Keywords: Parallel processing, Program correctness, Equivalent transformation, Specializer generation rule

Downloads: 1267
2052 Performance Improvement of Moving Object Recognition and Tracking Algorithm using Parallel Processing of SURF and Optical Flow

Authors: Jungho Choi, Youngwan Cho

Abstract:

The paper proposes a way of parallel processing of SURF and Optical Flow for moving object recognition and tracking. Object recognition and tracking is one of the most important tasks in computer vision; however, its disadvantage is that the many operations involved slow processing down, so that real-time object recognition and tracking cannot be achieved. The proposed method uses SURF, a typical feature extraction method, and Optical Flow for moving objects to reduce this disadvantage and achieve real-time moving object recognition and tracking, together with parallel processing techniques for speed improvement. First, an image from the DB and an image acquired through the camera are analysed using SURF and compared to recognize the same object; then an ROI (Region of Interest) is set for tracking the movement of feature points using Optical Flow. Secondly, multi-threading is used to improve processing speed and recognition through parallel processing. Finally, performance is evaluated and the efficiency of the algorithm is verified through experiment.

Keywords: moving object recognition, moving object tracking, SURF, Optical Flow, Multi-Thread.

Downloads: 2583
2051 Concurrency without Locking in Parallel Hash Structures used for Data Processing

Authors: Ákos Dudás, Sándor Juhász

Abstract:

Various mechanisms providing mutual exclusion and thread synchronization can be used to support parallel processing within a single computer. Instead of using locks, semaphores, barriers or other traditional approaches, in this paper we focus on alternative ways of making better use of modern multithreaded architectures and preparing hash tables for concurrent access. Hash structures will be used to demonstrate two entirely different approaches (rule-based cooperation and hardware synchronization support) and to compare them to an efficient parallel implementation using traditional locks. The comparison includes implementation details, performance ranking and scalability issues. We aim at understanding the effects the parallelization schemes have on the execution environment, with special focus on the memory system and memory access characteristics.

Keywords: Lock-free synchronization, mutual exclusion, parallel hash tables, parallel performance

Downloads: 1765
2050 Parallel Priority Region Approach to Detect Background

Authors: Sallama Athab, Hala Bahjat, Zhang Yinghui

Abstract:

Background detection is essential in video analysis; optimization is often needed in order to achieve real-time calculation. Information gathered by dual cameras placed in the front and rear part of an Autonomous Vehicle (AV) is integrated for background detection. In this paper, real-time calculation is achieved with the proposed technique by using Priority Regions (PR) and parallel processing together, where each frame is divided into regions and each region is processed in parallel. PR division depends upon driver view limitations. A background detection system is built on Temporal Difference (TD) and Gaussian Filtering (GF). The multiple threshold and sigma (weight) values used for Temporal Difference and Gaussian Filtering are based on PR characteristics. The experimental results are obtained on real scenes. A comparison of speed and accuracy with traditional background detection techniques and the effectiveness of PR and parallel processing are also discussed in this paper.

Keywords: Autonomous Vehicle, Background Detection, Dual Camera, Gaussian Filtering, Parallel Processing.

Downloads: 1630
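
The region-wise processing described in entry 2050 can be sketched roughly as follows. This is a minimal illustration, not the authors' system: it covers only the temporal-difference step (Gaussian filtering is omitted), frames are plain lists of grey-level rows, and the region layout `(first_row, last_row_exclusive, threshold)` and the function names are assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def temporal_difference(prev_rows, curr_rows, threshold):
    """Mark a pixel as changed (True) when it differs by more than threshold."""
    return [[abs(c - p) > threshold for p, c in zip(prow, crow)]
            for prow, crow in zip(prev_rows, curr_rows)]


def detect_by_region(prev, curr, regions, workers=4):
    """Apply temporal difference to each region with its own threshold.

    regions is a hypothetical layout: a list of
    (first_row, last_row_exclusive, threshold) tuples covering the frame
    from top to bottom.
    """
    def work(region):
        start, stop, threshold = region
        return temporal_difference(prev[start:stop], curr[start:stop], threshold)

    # Regions are independent, so they can be handed to separate workers.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(work, regions))
    # pool.map preserves region order, so the rows can simply be concatenated.
    return [row for part in parts for row in part]
```

In CPython, threads mainly illustrate the structure here; a real-time system would more plausibly use native or process-level parallelism.
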
2049 Performance Analysis of Parallel Client-Server Model Versus Parallel Mobile Agent Model

Authors: K. B. Manwade, G. A. Patil

Abstract:

Mobile agents have motivated the creation of a new methodology for parallel computing. We introduce a methodology for the creation of parallel applications on the network. The proposed mobile-agent parallel processing framework uses multiple Java mobile agents. Each mobile agent can travel to a specified machine in the network to perform its tasks. We also introduce the concept of a master agent, which is a Java object capable of implementing a particular task of the target application. The master agent dynamically assigns tasks to mobile agents. We have developed and tested a prototype application: Mobile Agent Based Parallel Computing. Boosted by the inherited benefits of using Java and mobile agents, our proposed methodology breaks the barriers between the environments, and could potentially exploit in a parallel manner all the available computational resources on the network. This paper elaborates on the performance issues of mobile agents for parallel computing.

Keywords: Parallel Computing, Mobile Agent.

Downloads: 1588
2048 Parallel Vector Processing Using Multi Level Orbital DATA

Authors: Nagi Mekhiel

Abstract:

Many applications use vector operations, applying a single instruction to multiple data that map to different locations in conventional memory. Transferring data from memory is limited by access latency and bandwidth, which affects the performance gain of vector processing. We present a memory system that makes all of its content available to processors in time, so that processors need not access the memory; we force each location to be available to all processors at a specific time. The data move in different orbits to become available to other processors in higher orbits at different times. We use this memory to apply parallel vector operations to data streams at the first orbit level. Data processed in the first level move to the upper orbit one data element at a time, allowing a processor in that orbit to apply another vector operation to deal with the serial code limitations inherent in all parallel applications and to interleave it with lower-level vector operations.

Keywords: Memory organization, parallel processors, serial code, vector processing.

Downloads: 1008
2047 A Parallel Quadtree Approach for Image Compression using Wavelets

Authors: Hamed Vahdat Nejad, Hossein Deldari

Abstract:

Wavelet transforms are multiresolution decompositions that can be used to analyze signals and images. Image compression is one of the major applications of wavelet transforms in image processing. It is considered one of the most powerful methods, providing a high compression ratio. However, its implementation is very time-consuming. On the other hand, parallel computing technologies are an efficient method for image compression using wavelets. In this paper, we propose a parallel wavelet compression algorithm based on quadtrees. We implement the algorithm using MatlabMPI (a parallel, message-passing version of Matlab), compute its isoefficiency function, and show that it is scalable. Our experimental results also confirm the efficiency of the algorithm.

Keywords: Image compression, MPI, Parallel computing, Wavelets.

Downloads: 1972
2046 A Multi-Level WEB Based Parallel Processing System A Hierarchical Volunteer Computing Approach

Authors: Abdelrahman Ahmed Mohamed Osman

Abstract:

Over the past few years, a number of efforts have been made to build parallel processing systems that utilize the idle power of the LANs and PCs available in many homes and corporations. The main advantage of these approaches is that they provide cheap parallel processing environments for those who cannot afford the expense of supercomputers and parallel processing hardware. However, most of the solutions provided are not very flexible in their use of available resources and are very difficult to install and set up. In this paper, a multi-level web-based parallel processing system (MWPS) is designed (appendix). MWPS is based on the idea of volunteer computing and is very flexible, easy to set up and easy to use. MWPS allows three types of subscribers: simple volunteers (single computers), super volunteers (full networks) and end users. All of these entities are coordinated transparently through a secure web site. Volunteer nodes provide the processing power needed by the system's end users. There is no limit on the number of volunteer nodes, and accordingly the system can grow indefinitely. Both volunteers and system users must register and subscribe. Once they subscribe, each entity is provided with the appropriate MWPS components. These components are very easy to install. Super volunteer nodes are provided with special components that make it possible to delegate some of the load to their inner nodes. These inner nodes may also delegate some of the load to other lower-level inner nodes, and so on. It is the responsibility of the parent super nodes to coordinate the delegation process and deliver the results back to the user. MWPS uses a simple behavior-based scheduler that takes into consideration the current load and previous behavior of processing nodes. Nodes that fulfill their contracts within the expected time get a high degree of trust. Nodes that fail to satisfy their contract get a lower degree of trust.
MWPS is based on the .NET framework and provides the minimal level of security expected in distributed processing environments. Users and processing nodes are fully authenticated. Communications and messages between nodes are very secure. The system has been implemented using C#. MWPS may be used by any group of people or companies to establish a parallel processing or grid environment.

Keywords: Volunteer computing, Parallel Processing, XMLWebServices, .NET Remoting, Tuplespace.

Downloads: 1434
2045 Parallel 2-Opt Local Search on GPU

Authors: Wen-Bao Qiao, Jean-Charles Créput

Abstract:

To accelerate the solution of large-scale traveling salesman problems (TSP), a parallel 2-opt local search algorithm with a simple implementation based on the Graphics Processing Unit (GPU) is presented and tested in this paper. The parallel scheme is based on the technique of data decomposition, dynamically assigning K processors on the integral tour to treat K edges' 2-opt local optimization simultaneously on independent sub-tours, where K can be user-defined or have a functional relationship with the input size N. We implement this algorithm with a doubly linked list on the GPU. The implementation requires only O(N) memory. We compare this parallel 2-opt local optimization against sequential exhaustive 2-opt search along the integral tour on TSP instances from TSPLIB with more than 10000 cities.

Keywords: Doubly linked list, parallel 2-opt, tour division, GPU.

Downloads: 1166
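
For readers unfamiliar with the 2-opt move that entry 2045 parallelizes, the sketch below shows the sequential exhaustive baseline on a small instance. It is an illustration only: it uses a plain Python list rather than the doubly linked list and GPU decomposition described in the abstract, and the function names are assumptions made for the example.

```python
import math


def tour_length(points, tour):
    """Length of the closed tour visiting points in the given order."""
    n = len(tour)
    return sum(math.dist(points[tour[i]], points[tour[(i + 1) % n]])
               for i in range(n))


def two_opt(points, tour):
    """Sequential exhaustive 2-opt: reverse segments while that shortens the tour."""
    tour = list(tour)
    n = len(tour)
    improved = True
    while improved:
        improved = False
        for i in range(n - 1):
            for j in range(i + 2, n):
                if i == 0 and j == n - 1:
                    continue  # the two edges share a city, nothing to swap
                a, b = points[tour[i]], points[tour[i + 1]]
                c, d = points[tour[j]], points[tour[(j + 1) % n]]
                # Gain from replacing edges (a,b) and (c,d) by (a,c) and (b,d).
                delta = (math.dist(a, c) + math.dist(b, d)
                         - math.dist(a, b) - math.dist(c, d))
                if delta < -1e-12:
                    tour[i + 1:j + 1] = reversed(tour[i + 1:j + 1])
                    improved = True
    return tour
```

On a unit square visited in a crossing order, a single 2-opt move removes the crossing and yields the perimeter tour.
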
2044 Performance Comparison of Parallel Sorting Algorithms on the Cluster of Workstations

Authors: Lai Lai Win Kyi, Nay Min Tun

Abstract:

Sorting has received the most attention among all computational tasks over the past years because sorted data is at the heart of many computations. Sorting is of additional importance to parallel computing because of its close relation to the task of routing data among processes, which is an essential part of many parallel algorithms. Many parallel sorting algorithms have been investigated for a variety of parallel computer architectures. In this paper, three parallel sorting algorithms have been implemented and compared in terms of their overall execution time. The algorithms implemented are the odd-even transposition sort, parallel merge sort and parallel rank sort. A Cluster of Workstations (Windows Compute Cluster) has been used to compare the algorithms implemented. The C# programming language is used to develop the sorting algorithms. The MPI (Message Passing Interface) library has been selected to establish the communication and synchronization between processors. The time complexity of each parallel sorting algorithm is also discussed and analyzed.

Keywords: Cluster of Workstations, Parallel sorting algorithms, performance analysis, parallel computing and MPI.

Downloads: 1428
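
Of the three algorithms named in entry 2044, odd-even transposition sort is the simplest to sketch. The version below is a sequential simulation in Python, not the C#/MPI implementation the abstract describes; the function name is an assumption. Within one phase, every compare-exchange touches a disjoint pair, which is what allows the pairs to be handled by different processors.

```python
def odd_even_transposition_sort(values):
    """Sort by alternating even-phase and odd-phase neighbour exchanges."""
    a = list(values)
    n = len(a)
    for phase in range(n):  # n phases are enough for n elements
        start = phase % 2   # even phases pair (0,1),(2,3)...; odd phases (1,2),(3,4)...
        for i in range(start, n - 1, 2):
            # Pairs within a phase are disjoint, so they could run concurrently.
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
    return a
```
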
2043 Concurrent Approach to Data Parallel Model using Java

Authors: Bala Dhandayuthapani Veerasamy

Abstract:

Parallel programming models exist as an abstraction of hardware and memory architectures. There are several parallel programming models in common use: the shared memory model, thread model, message passing model, data parallel model, hybrid model, Flynn's models, embarrassingly parallel computations model and pipelined computations model. These models are not specific to a particular type of machine or memory architecture. This paper presents a model program for a concurrent approach to the data parallel model through Java programming.

Keywords: Concurrent, Data Parallel, JDK, Parallel, Thread

Downloads: 2036
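
The data parallel model mentioned in entry 2043 applies the same operation to different partitions of a data set. The paper works in Java; the sketch below shows the same idea in Python with a thread pool, and every name in it (`chunks`, `parallel_map`) is an assumption made for illustration rather than anything taken from the paper.

```python
from concurrent.futures import ThreadPoolExecutor


def chunks(data, parts):
    """Split data into at most `parts` contiguous slices of near-equal size."""
    size = -(-len(data) // parts)  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]


def parallel_map(func, data, workers=4):
    """Apply func to every element, one worker per chunk, preserving order."""
    if not data:
        return []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each worker runs the same operation on its own partition.
        results = pool.map(lambda chunk: [func(x) for x in chunk],
                           chunks(data, workers))
        return [y for part in results for y in part]
```

For example, `parallel_map(lambda x: x * x, list(range(10)), workers=3)` squares ten numbers in three partitions and returns them in the original order.
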
2042 Damage Strain Analysis of Parallel Fiber Eutectic

Authors: Jian Zheng, Xinhua Ni, Xiequan Liu

Abstract:

According to isotropy of parallel fiber eutectic, the no-damage strain field in parallel fiber eutectic is obtained from the flexibility tensor of parallel fiber eutectic. Considering the damage behavior of parallel fiber eutectic, damage variables are introduced to determine the strain field of parallel fiber eutectic. The damage strains in the matrix, interphase, and fiber of parallel fiber eutectic are quantitatively analyzed. Results show that damage strains are not only associated with the fiber volume fraction of parallel fiber eutectic, but also with the damage degree.

Keywords: Parallel fiber eutectic, no-damage strain, damage strain, fiber volume fraction, damage degree.

Downloads: 899
2041 JConqurr - A Multi-Core Programming Toolkit for Java

Authors: G.A.C.P. Ganegoda, D.M.A. Samaranayake, L.S. Bandara, K.A.D.N.K. Wimalawarne

Abstract:

With the popularity of multi-core and many-core architectures, there is a great need for software frameworks which can support parallel programming methodologies. In this paper we introduce an Eclipse toolkit, JConqurr, which is easy to use and provides robust support for flexible parallel programming. JConqurr is a multi-core and many-core programming toolkit for Java which is capable of providing support for common parallel programming patterns, which include task, data, divide-and-conquer and pipeline parallelism. The toolkit uses an annotation and a directive mechanism to convert sequential code into parallel code. In addition, we have proposed a novel mechanism to achieve parallelism using graphical processing units (GPU). Experiments with common parallelizable algorithms have shown that our toolkit can be easily and efficiently used to convert sequential code to parallel code and that significant performance gains can be achieved.

Keywords: Multi-core, parallel programming patterns, GPU, Java, Eclipse plugin, toolkit

Downloads: 2056
2040 Performance Analysis of the Subgroup Method for Collective I/O

Authors: Kwangho Cha, Hyeyoung Cho, Sungho Kim

Abstract:

As many scientific applications require large data processing, the importance of parallel I/O has been increasingly recognized. Collective I/O is one of the considerable features of parallel I/O and enables application programmers to easily handle their large data volume. In this paper we measured and analyzed the performance of original collective I/O and the subgroup method, the way of using collective I/O of MPI effectively. From the experimental results, we found that the subgroup method showed good performance with small data size.

Keywords: Collective I/O, MPI, parallel file system.

Downloads: 1520
2039 High Performance in Parallel Data Integration: An Empirical Evaluation of the Ratio Between Processing Time and Number of Physical Nodes

Authors: Caspar von Seckendorff, Eldar Sultanow

Abstract:

Many studies have shown that parallelization decreases efficiency [1], [2]. There are many reasons for these decrements. This paper investigates those which appear in the context of parallel data integration. Integration processes generally cannot be allocated to packages of identical size (i.e. tasks of identical complexity). The reason for this is unknown heterogeneous input data which result in variable task lengths. Process delay is defined by the slowest processing node. It leads to a detrimental effect on the total processing time. With a real world example, this study will show that while process delay does initially increase with the introduction of more nodes it ultimately decreases again after a certain point. The example will make use of the cloud computing platform Hadoop and be run inside Amazon's EC2 compute cloud. A stochastic model will be set up which can explain this effect.

Keywords: Process delay, speedup, efficiency, parallel computing, data integration, E-Commerce, Amazon Elastic Compute Cloud (EC2), Hadoop, Nutch.

Downloads: 1576
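
The statement in entry 2039 that the slowest node determines the total processing time can be made concrete with a small calculation. The sketch below is not the paper's stochastic model; it assumes a simple round-robin assignment of tasks to nodes, and the function names are hypothetical.

```python
def node_finish_times(task_lengths, num_nodes):
    """Total work per node when tasks are dealt out round-robin (an assumption)."""
    loads = [0.0] * num_nodes
    for index, length in enumerate(task_lengths):
        loads[index % num_nodes] += length
    return loads


def total_processing_time(task_lengths, num_nodes):
    """The job finishes only when the slowest node finishes."""
    return max(node_finish_times(task_lengths, num_nodes))
```

With task lengths `[5, 1, 1, 1]` on two nodes, the loads are 6 and 2, so the job takes 6 time units although a perfectly even split of the 8 units of work would take 4.
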
2038 Parallel Text Processing: Alignment of Indonesian to Javanese Language

Authors: Aji P. Wibawa, Andrew Nafalski, Neil Murray, Wayan F. Mahmudy

Abstract:

Parallel text alignment is proposed as a way of aligning bahasa Indonesia to words in Javanese. Since the one-to-one word translator does not have the facility to translate pragmatic aspects of Javanese, the parallel text alignment model described uses a phrase pair combination. The algorithm aligns the parallel text automatically from the beginning to the end of each sentence. Even though the results of the phrase pair combination outperform the previous algorithm, it is still inefficient. Recording all possible combinations consumes more space in the database and is time-consuming. The original algorithm is modified by applying the edit distance coefficient to improve the data-storage efficiency. As a result, data-storage consumption is reduced by 90%, and the learning period is also reduced (42s).

Keywords: Parallel text alignment, phrase pair combination, edit distance coefficient, Javanese-Indonesian language.

Downloads: 2424
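
Entry 2038 does not define its edit distance coefficient, so the sketch below only shows the standard Levenshtein edit distance on which such a measure is presumably built, together with a normalised similarity score. The `similarity` function is an assumption for illustration and may differ from the coefficient used in the paper.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            current.append(min(previous[j] + 1,                # delete ca
                               current[j - 1] + 1,             # insert cb
                               previous[j - 1] + (ca != cb)))  # substitute or match
        previous = current
    return previous[-1]


def similarity(a, b):
    """Hypothetical normalised score in [0, 1]; 1.0 means identical strings."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest
```

A score of this kind could be used to skip storing phrase pairs that are near-duplicates of ones already recorded, which is one plausible route to the storage saving the abstract reports.
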
2037 Processor Scheduling on Parallel Computers

Authors: Mohammad S. Laghari, Gulzar A. Khuwaja

Abstract:

Many problems in computer vision and image processing present potential for parallel implementations through one of the three major paradigms of geometric parallelism, algorithmic parallelism and processor farming. Static process scheduling techniques are used successfully to exploit geometric and algorithmic parallelism, while dynamic process scheduling is better suited to dealing with the independent processes inherent in the processor farming paradigm. This paper considers the application of parallel or multi-computers to a class of problems exhibiting spatial data characteristic of the geometric paradigm. However, by using the processor farming paradigm, a dynamic scheduling technique is developed to suit the MIMD structure of the multi-computers. A hybrid scheme of scheduling is also developed and compared with the other schemes. The specific problem chosen for the investigation is the Hough transform for line detection.

Keywords: Hough transforms, parallel computer, parallel paradigms, scheduling.

Downloads: 1597
2036 Implementation of Parallel Interface for Microprocessor Trainer

Authors: Moe Moe Htun, Khin Htar Nwe

Abstract:

In this paper, a parallel interface for a microprocessor trainer was implemented. A programmable parallel-port device such as the IC 8255A is initialized for simple input or output and for handshake input or output by choosing among its modes. The hardware connections and the programs can be used to interface a microprocessor trainer and a personal computer by using the IC 8255A. The assembly programs edited in the PC's editor can be downloaded to the trainer.

Keywords: Parallel I/O ports, parallel interface, trainer, two 8255 ICs.

Downloads: 3107
2035 A Heuristic Algorithm Approach for Scheduling of Multi-criteria Unrelated Parallel Machines

Authors: Farhad Kolahan, Vahid Kayvanfar

Abstract:

In this paper we address a multi-objective scheduling problem for unrelated parallel machines. In unrelated parallel systems, the processing cost/time of a given job on different machines may vary. The objective of scheduling is to simultaneously determine the job-machine assignment and the job sequencing on each machine in such a way that the total cost of the schedule is minimized. The cost function consists of three components, namely machining cost, earliness/tardiness penalties and makespan-related cost. Such a scheduling problem is combinatorial in nature. Therefore, a Simulated Annealing approach is employed to provide good solutions within reasonable computational times. Computational results show that the proposed approach can efficiently solve such complicated problems.

Keywords: Makespan, Parallel machines, Scheduling, Simulated Annealing

Downloads: 1586
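
The simulated annealing approach of entry 2035 can be illustrated with a heavily simplified sketch. It optimises only the makespan of a job-to-machine assignment; the machining cost, earliness/tardiness penalties and job sequencing from the abstract are left out, and the cooling schedule, neighbourhood move and parameter values are assumptions chosen for the example.

```python
import math
import random


def makespan(assignment, times):
    """Largest machine load; times[job][machine] is the processing time."""
    loads = [0.0] * len(times[0])
    for job, machine in enumerate(assignment):
        loads[machine] += times[job][machine]
    return max(loads)


def anneal(times, steps=2000, temp=10.0, cooling=0.995, seed=0):
    """Simulated annealing over assignments; returns (best_assignment, best_cost)."""
    rng = random.Random(seed)
    jobs, machines = len(times), len(times[0])
    current = [0] * jobs  # start with every job on machine 0
    cost = makespan(current, times)
    best, best_cost = list(current), cost
    for _ in range(steps):
        candidate = list(current)
        candidate[rng.randrange(jobs)] = rng.randrange(machines)  # move one job
        new_cost = makespan(candidate, times)
        # Always accept improvements; accept worse moves with a probability
        # that shrinks as the temperature falls.
        if new_cost <= cost or rng.random() < math.exp((cost - new_cost) / temp):
            current, cost = candidate, new_cost
            if cost < best_cost:
                best, best_cost = list(current), cost
        temp *= cooling
    return best, best_cost
```

Because the best assignment seen so far is kept separately, the returned cost can never be worse than that of the starting assignment.
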
2034 A Technique for Reachability Graph Generation for the Petri Net Models of Parallel Processes

Authors: Farooq Ahmad, Hejiao Huang, Xiaolong Wang

Abstract:

Reachability graph (RG) generation suffers from the problem of exponential space and time complexity. To alleviate the more critical problem of time complexity, this paper presents a new approach to RG generation for the Petri net (PN) models of parallel processes. Independent RGs for each parallel process in the PN structure are generated in parallel, and the cross-product of these RGs forms the exhaustive state space from which the RG of the given parallel system is determined. The complexity analysis of the presented algorithm shows a significant decrease in the time complexity cost of RG generation. The proposed technique is applicable to parallel programs having multiple threads with the synchronization problem.

Keywords: Parallel processes, Petri net, reachability graph, time complexity.

Downloads: 1959
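
The cross-product step described in entry 2034 can be sketched as follows. Each per-process reachability graph is represented here as a dictionary from a state to its successor states (an assumed representation), and the product graph interleaves the moves of the individual processes. The step in which the paper derives the actual RG of the synchronised system from this exhaustive state space is not shown.

```python
from itertools import product


def product_graph(graphs):
    """Cross-product of independent reachability graphs.

    graphs: list of dicts mapping each state to a list of successor states.
    Returns a dict mapping each combined state (a tuple) to its successors,
    where exactly one component process moves per transition.
    """
    edges = {}
    for state in product(*[list(g) for g in graphs]):
        successors = []
        for i, g in enumerate(graphs):
            for nxt in g[state[i]]:
                successors.append(state[:i] + (nxt,) + state[i + 1:])
        edges[state] = successors
    return edges
```

Two processes with two states each give a product graph with four combined states.
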
2033 A Parallel Implementation of k-Means in MATLAB

Authors: Dimitris Varsamis, Christos Talagkozis, Alkiviadis Tsimpiris, Paris Mastorocostas

Abstract:

The aim of this work is the parallel implementation of k-means in MATLAB, in order to reduce the execution time. Specifically, a new MATLAB function for the serial k-means algorithm is developed, which meets all the requirements for conversion to a MATLAB function with parallel computations. Additionally, two different variants for the definition of initial values are presented. Subsequently, the parallel approach is presented. Finally, performance tests of the computation times with respect to the numbers of features and classes are presented.

Keywords: K-means algorithm, clustering, parallel computations, MATLAB.

Downloads: 1080
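
Entry 2033 implements parallel k-means in MATLAB; the sketch below shows, in Python, which part of the algorithm lends itself to parallel execution. The assignment of each point to its nearest centroid is independent of all other points, so it is mapped over a thread pool. The function names and the fixed iteration count are assumptions, and this is not the authors' MATLAB function.

```python
from concurrent.futures import ThreadPoolExecutor


def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda k: sum((p - c) ** 2 for p, c in zip(point, centroids[k])))


def kmeans(points, centroids, iterations=10, workers=4):
    """Lloyd's iterations with the assignment step mapped over a thread pool."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for _ in range(iterations):
            # Assignment step: independent per point, hence parallelisable.
            labels = list(pool.map(lambda p: nearest(p, centroids), points))
            # Update step: each centroid becomes the mean of its members.
            for k in range(len(centroids)):
                members = [p for p, label in zip(points, labels) if label == k]
                if members:
                    centroids[k] = tuple(sum(axis) / len(members)
                                         for axis in zip(*members))
    return centroids, labels
```

In CPython the thread pool shows the structure rather than delivering a speed-up for pure-Python arithmetic; the initial centroids are supplied by the caller, mirroring the abstract's point that the choice of initial values is a separate design decision.
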
2032 Local Linear Model Tree (LOLIMOT) Reconfigurable Parallel Hardware

Authors: A. Pedram, M. R. Jamali, T. Pedram, S. M. Fakhraie, C. Lucas

Abstract:

Local Linear Neuro-Fuzzy Models (LLNFM), like other neuro-fuzzy systems, are adaptive networks that provide robust learning capabilities and are widely utilized in various applications such as pattern recognition, system identification, image processing and prediction. Local linear model tree (LOLIMOT) is a type of Takagi-Sugeno-Kang neuro-fuzzy algorithm which has proven its efficiency compared with other neuro-fuzzy networks in learning nonlinear systems and in pattern recognition. In this paper, dedicated reconfigurable and parallel processing hardware for the LOLIMOT algorithm and its applications are presented. This hardware realizes on-chip learning, which gives it the capability to work as a standalone device in a system. The synthesis results on FPGA platforms show its potential to run at least 250 times faster than software-implemented algorithms.

Keywords: LOLIMOT, hardware, neurofuzzy systems, reconfigurable, parallel.

Downloads: 3818
2031 Harmonic Reduction In Three-Phase Parallel Connected Inverter

Authors: M.A.A. Younis, N. A. Rahim, S. Mekhilef

Abstract:

This paper presents the design and analysis of a parallel connected inverter configuration. The configuration consists of parallel connected three-phase dc/ac inverters. Series resistors are added to the inverter outputs to maintain the same current in each of the two parallel inverters and to reduce the circulating current in the parallel inverters to the minimum. High-frequency third harmonic injection PWM (THIPWM) is employed to reduce the total harmonic distortion and to make maximum use of the voltage source. A DSP was used to generate the THIPWM and the control algorithm for the converter. Selected experimental results are shown to validate the proposed system.

Keywords: Three-phase inverter, Third harmonic injection PWM, inverters parallel connection.

Downloads: 3722
2030 Pulsed Multi-Layered Image Filtering: A VLSI Implementation

Authors: Christian Mayr, Holger Eisenreich, Stephan Henker, René Schüffny

Abstract:

Image convolution similar to the receptive fields found in mammalian visual pathways has long been used in conventional image processing in the form of Gabor masks. However, no VLSI implementation of parallel, multi-layered pulsed processing has been brought forward which would emulate this property. We present a technical realization of such a pulsed image processing scheme. The discussed IC also serves as a general testbed for VLSI-based pulsed information processing, which is of interest especially with regard to the robustness of representing an analog signal in the phase or duration of a pulsed, quasi-digital signal, as well as the possibility of direct digital manipulation of such an analog signal. The network connectivity and processing properties are reconfigurable so as to allow adaptation to various processing tasks.

Keywords: Neural image processing, pulse computation application, pulsed Gabor convolution, VLSI pulse routing.

Downloads: 1336
2029 Network Based High Performance Computing

Authors: Karanjeet Singh Kahlon, Gurvinder Singh, Arjan Singh

Abstract:

In the past few years there has been a change in the view of high performance applications and parallel computing. Initially such applications were targeted towards dedicated parallel machines. Recently, the trend has been shifting towards building meta-applications composed of several modules that exploit heterogeneous platforms and employ hybrid forms of parallelism. The aim of this paper is to propose a model of virtual parallel computing. The virtual parallel computing system provides a flexible object-oriented software framework that makes it easy for programmers to write various parallel applications.

Keywords: Applet, Efficiency, Java, LAN

Downloads: 1847
2028 Some Results on Parallel Alternating Methods

Authors: Guangbin Wang, Fuping Tan

Abstract:

In this paper, we investigate two parallel alternating methods for solving the system of linear equations Ax = b and give convergence theorems for the parallel alternating methods when the coefficient matrix is a nonsingular H-matrix. Furthermore, we give one example to show our results.

Keywords: Nonsingular H-matrix, parallel alternating method, convergence.

Downloads: 1059
2027 Design and Implementation of Shared Memory based Parallel File System Logging Method for High Performance Computing

Authors: Hyeyoung Cho, Sungho Kim, SangDong Lee

Abstract:

I/O workload is a critical and important factor in analyzing I/O patterns and file system performance. However, tracing I/O operations on the fly in a distributed parallel file system is non-trivial due to collection overhead and the large volume of data. In this paper, we design and implement a parallel file system logging method for high performance computing using a shared memory-based multi-layer scheme. It minimizes the overhead by reducing the logging operation response time and provides an efficient post-processing scheme through shared memory. A separate logging server can collect sequential logs from multiple clients in a cluster through packet communication. Implementation and evaluation results show the low overhead and high scalability of this architecture for high performance parallel logging analysis.

Keywords: I/O workload, PVFS, I/O Trace.

Downloads: 1513
2026 High Level Synthesis of Digital Filters Based On Sub-Token Forwarding

Authors: Iyad F. Jafar, Sandra J. Alrawashdeh, Ban K. Alhamayel

Abstract:

High level synthesis (HLS) is a process which generates a register-transfer level design for digital systems from a behavioral description. There are many HLS algorithms and commercial tools. However, most of these algorithms consider a behavioral description of the system in which a single token is presented to the system. This approach does not exploit extra hardware efficiently, especially in the design of digital filters, where common operations may exist between successive tokens. In this paper, we modify the behavioral description to process multiple tokens in parallel. However, this approach is unlike full parallel processing, which requires full hardware replication. It exploits the presence of common operations between successive tokens. The performance of the proposed approach is better than sequential processing and approaches that of full parallel processing as the hardware resources are increased.

Keywords: Digital filters, High level synthesis, Sub-token forwarding

Downloads: 1403
2025 Performance Evaluation of Popular Hash Functions

Authors: Sheena Mathew, K. Poulose Jacob

Abstract:

This paper describes the results of an extensive study and comparison of the popular hash functions SHA-1, SHA-256, RIPEMD-160 and RIPEMD-320 with JERIM-320, a 320-bit hash function. The compression functions of hash functions like SHA-1 and SHA-256 are designed using serial successive iteration, whereas those like RIPEMD-160 and RIPEMD-320 are designed using two parallel lines of message processing. JERIM-320 uses four parallel lines of message processing, resulting in a higher level of security than other hash functions at comparable speed and memory requirement. The performance evaluation of these methods has been done by using practical implementation and also by using step computation methods. JERIM-320 proves to be secure and ensures the integrity of messages to a higher degree. The focus of this work is to establish JERIM-320 as an alternative to present-day hash functions for fast-growing internet applications.

Keywords: Cryptography, Hash function, JERIM-320, Message integrity

Downloads: 2582
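
A practical timing comparison of the kind entry 2025 describes can be sketched with Python's standard `hashlib` module. Only SHA-1 and SHA-256 are used here because they are always available in `hashlib`; RIPEMD-160 depends on the underlying OpenSSL build, and RIPEMD-320 and JERIM-320 are not part of the standard library. The helper name `time_hash` is an assumption.

```python
import hashlib
import time


def time_hash(name, data, repeats=100):
    """Return (digest size in bits, seconds taken to hash data `repeats` times)."""
    bits = hashlib.new(name).digest_size * 8
    start = time.perf_counter()
    for _ in range(repeats):
        hashlib.new(name, data).digest()
    return bits, time.perf_counter() - start
```

Calling `time_hash("sha1", message)` and `time_hash("sha256", message)` on the same `message` gives a rough speed comparison; absolute timings depend on the machine and the library build, so no figures are quoted here.
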
2024 Some Characteristics of Systolic Arrays

Authors: Halil Snopce, Ilir Spahiu

Abstract:

This paper investigates a possible optimization of some linear algebra problems which can be solved by parallel processing using special arrays called systolic arrays. Some special types of transformations are used for designing these arrays, and we show the characteristics of the resulting arrays. The main focus is on discussing the advantages of these arrays in the parallel computation of the matrix product, with special attention to the design of a systolic array for matrix multiplication. Multiplication of large matrices requires a lot of computational time, and its complexity is O(n^3). Many algorithms (both sequential and parallel) have been developed with the purpose of minimizing the calculation time. Systolic arrays are well suited for this purpose. In this paper we show that using an appropriate transformation leads to more efficient arrays for calculations of this type.

Keywords: Data dependences, matrix multiplication, systolic array, transformation matrix.

Downloads: 1466
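
To make the systolic-array idea in entry 2024 concrete, the sketch below simulates a common textbook design: an n-by-n grid of processing elements in which element (i, j) accumulates one entry of the product and operand pairs arrive skewed in time. It is not one of the transformation-derived arrays studied in the paper, it assumes square matrices, and the function name is hypothetical.

```python
def systolic_matmul(a, b):
    """Simulate an output-stationary n x n systolic array computing a @ b.

    At time step t, processing element (i, j) receives a[i][k] and b[k][j]
    with k = t - i - j (if that index is valid) and adds their product to
    its local accumulator. All elements work simultaneously within a step.
    """
    n = len(a)
    c = [[0] * n for _ in range(n)]
    for t in range(3 * n - 2):  # the last operand pair arrives at t = 3n - 3
        for i in range(n):
            for j in range(n):
                k = t - i - j
                if 0 <= k < n:
                    c[i][j] += a[i][k] * b[k][j]
    return c
```

The total work is still n^3 multiply-accumulate operations, but with n^2 processing elements working concurrently the product is complete after 3n - 2 time steps.
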