Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1130

Search results for: Parallel Computing

1130 Network Based High Performance Computing

Authors: Karanjeet Singh Kahlon, Gurvinder Singh, Arjan Singh

Abstract:

In the past few years there is a change in the view of high performance applications and parallel computing. Initially such applications were targeted towards dedicated parallel machines. Recently trend is changing towards building meta-applications composed of several modules that exploit heterogeneous platforms and employ hybrid forms of parallelism. The aim of this paper is to propose a model of virtual parallel computing. Virtual parallel computing system provides a flexible object oriented software framework that makes it easy for programmers to write various parallel applications.

Keywords: Applet, Efficiency, Java, LAN

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1174
1129 Performance Analysis of Parallel Client-Server Model Versus Parallel Mobile Agent Model

Authors: K. B. Manwade, G. A. Patil

Abstract:

Mobile agent has motivated the creation of a new methodology for parallel computing. We introduce a methodology for the creation of parallel applications on the network. The proposed Mobile-Agent parallel processing framework uses multiple Javamobile Agents. Each mobile agent can travel to the specified machine in the network to perform its tasks. We also introduce the concept of master agent, which is Java object capable of implementing a particular task of the target application. Master agent is dynamically assigns the task to mobile agents. We have developed and tested a prototype application: Mobile Agent Based Parallel Computing. Boosted by the inherited benefits of using Java and Mobile Agents, our proposed methodology breaks the barriers between the environments, and could potentially exploit in a parallel manner all the available computational resources on the network. This paper elaborates performance issues of a mobile agent for parallel computing.

Keywords: Parallel Computing, Mobile Agent.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1322
1128 Applying Autonomic Computing Concepts to Parallel Computing using Intelligent Agents

Authors: Blesson Varghese, Gerard T. McKee

Abstract:

The work reported in this paper is motivated by the fact that there is a need to apply autonomic computing concepts to parallel computing systems. Advancing on prior work based on intelligent cores [36], a swarm-array computing approach, this paper focuses on 'Intelligent agents' another swarm-array computing approach in which the task to be executed on a parallel computing core is considered as a swarm of autonomous agents. A task is carried to a computing core by carrier agents and is seamlessly transferred between cores in the event of a predicted failure, thereby achieving self-ware objectives of autonomic computing. The feasibility of the proposed swarm-array computing approach is validated on a multi-agent simulator.

Keywords: Autonomic computing, intelligent agents, swarm-array computing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1316
1127 Performance Comparison of Parallel Sorting Algorithms on the Cluster of Workstations

Authors: Lai Lai Win Kyi, Nay Min Tun

Abstract:

Sorting appears the most attention among all computational tasks over the past years because sorted data is at the heart of many computations. Sorting is of additional importance to parallel computing because of its close relation to the task of routing data among processes, which is an essential part of many parallel algorithms. Many parallel sorting algorithms have been investigated for a variety of parallel computer architectures. In this paper, three parallel sorting algorithms have been implemented and compared in terms of their overall execution time. The algorithms implemented are the odd-even transposition sort, parallel merge sort and parallel rank sort. Cluster of Workstations or Windows Compute Cluster has been used to compare the algorithms implemented. The C# programming language is used to develop the sorting algorithms. The MPI (Message Passing Interface) library has been selected to establish the communication and synchronization between processors. The time complexity for each parallel sorting algorithm will also be mentioned and analyzed.

Keywords: Cluster of Workstations, Parallel sorting algorithms, performance analysis, parallel computing and MPI.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1155
1126 Towards Self-ware via Swarm-Array Computing

Authors: Blesson Varghese, Gerard McKee

Abstract:

The work reported in this paper proposes Swarm-Array computing, a novel technique inspired by swarm robotics, and built on the foundations of autonomic and parallel computing. The approach aims to apply autonomic computing constructs to parallel computing systems and in effect achieve the self-ware objectives that describe self-managing systems. The constitution of swarm-array computing comprising four constituents, namely the computing system, the problem/task, the swarm and the landscape is considered. Approaches that bind these constituents together are proposed. Space applications employing FPGAs are identified as a potential area for applying swarm-array computing for building reliable systems. The feasibility of a proposed approach is validated on the SeSAm multi-agent simulator and landscapes are generated using the MATLAB toolkit.

Keywords: Swarm-Array computing, Autonomic computing, landscapes.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1324
1125 A Parallel Quadtree Approach for Image Compression using Wavelets

Authors: Hamed Vahdat Nejad, Hossein Deldari

Abstract:

Wavelet transforms are multiresolution decompositions that can be used to analyze signals and images. Image compression is one of major applications of wavelet transforms in image processing. It is considered as one of the most powerful methods that provides a high compression ratio. However, its implementation is very time-consuming. At the other hand, parallel computing technologies are an efficient method for image compression using wavelets. In this paper, we propose a parallel wavelet compression algorithm based on quadtrees. We implement the algorithm using MatlabMPI (a parallel, message passing version of Matlab), and compute its isoefficiency function, and show that it is scalable. Our experimental results confirm the efficiency of the algorithm also.

Keywords: Image compression, MPI, Parallel computing, Wavelets.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1759
1124 A High Performance MPI for Parallel and Distributed Computing

Authors: Prabu D., Vanamala V., Sanjeeb Kumar Deka, Sridharan R., Prahlada Rao B. B., Mohanram N.

Abstract:

Message Passing Interface is widely used for Parallel and Distributed Computing. MPICH and LAM are popular open source MPIs available to the parallel computing community also there are commercial MPIs, which performs better than MPICH etc. In this paper, we discuss a commercial Message Passing Interface, CMPI (C-DAC Message Passing Interface). C-MPI is an optimized MPI for CLUMPS. It is found to be faster and more robust compared to MPICH. We have compared performance of C-MPI and MPICH on Gigabit Ethernet network.

Keywords: C-MPI, C-VIA, HPC, MPICH, P-COMS, PMB

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1291
1123 A Consideration of the Achievement of Productive Level Parallel Programming Skills

Authors: Tadayoshi Horita, Masakazu Akiba, Mina Terauchi, Tsuneo Kanno

Abstract:

This paper gives a consideration of the achievement of productive level parallel programming skills, based on the data of the graduation studies in the Polytechnic University of Japan. The data show that most students can achieve only parallel programming skills during the graduation study (about 600 to 700 hours), if the programming environment is limited to GPGPUs. However, the data also show that it is a very high level task that a student achieves productive level parallel programming skills during only the graduation study. In addition, it shows that the parallel programming environments for GPGPU, such as CUDA and OpenCL, may be more suitable for parallel computing education than other environments such as MPI on a cluster system and Cell.B.E. These results must be useful for the areas of not only software developments, but also hardware product developments using computer technologies.

Keywords: Parallel computing, programming education, GPU, GPGPU, CUDA, OpenCL, MPI, Cell.B.E.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1434
1122 Neural Networks Approaches for Computing the Forward Kinematics of a Redundant Parallel Manipulator

Authors: H. Sadjadian , H.D. Taghirad Member, A. Fatehi

Abstract:

In this paper, different approaches to solve the forward kinematics of a three DOF actuator redundant hydraulic parallel manipulator are presented. On the contrary to series manipulators, the forward kinematic map of parallel manipulators involves highly coupled nonlinear equations, which are almost impossible to solve analytically. The proposed methods are using neural networks identification with different structures to solve the problem. The accuracy of the results of each method is analyzed in detail and the advantages and the disadvantages of them in computing the forward kinematic map of the given mechanism is discussed in detail. It is concluded that ANFIS presents the best performance compared to MLP, RBF and PNN networks in this particular application.

Keywords: Forward Kinematics, Neural Networks, Numerical Solution, Parallel Manipulators.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1623
1121 Experimental Parallel Architecture for Rendering 3D Model into MPEG-4 Format

Authors: Ajay Joshi, Surya Ismail

Abstract:

This paper will present the initial findings of a research into distributed computer rendering. The goal of the research is to create a distributed computer system capable of rendering a 3D model into an MPEG-4 stream. This paper outlines the initial design, software architecture and hardware setup for the system. Distributed computing means designing and implementing programs that run on two or more interconnected computing systems. Distributed computing is often used to speed up the rendering of graphical imaging. Distributed computing systems are used to generate images for movies, games and simulations. A topic of interest is the application of distributed computing to the MPEG-4 standard. During the course of the research, a distributed system will be created that can render a 3D model into an MPEG-4 stream. It is expected that applying distributed computing principals will speed up rendering, thus improving the usefulness and efficiency of the MPEG-4 standard

Keywords: Cluster, parallel architecture, rendering, MPEG-4.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1238
1120 Using Cloud Computing for E-Government: Challenges and Benefits

Authors: Sajjad Hashemi, Khalil Monfaredi, Mohammad Masdari

Abstract:

Cloud computing is a style of computing which is formed from the aggregation and development of technologies such as grid computing distributed computing, parallel computing and service-oriented architecture. And its aim is to provide computing, communication and storage resources in a safe environment based on service, as fast as possible, which is virtually provided via Internet platform. Considering that the provided Services in e-government are available via the Internet, thus cloud computing can be used in the implementation of e-government architecture and provide better service with the lowest economic cost using its benefits. In this paper, the Methods of using cloud computing in e-government has been studied and it's been attempted to identify the challenges and benefits of the cloud to get used in the e-government and proposals have been offered to overcome its shortcomings, encourage and partnership of governments and people to use this economical and new technology.

Keywords: Benefits, Cloud computing, Committee, Challenges, E-Government, Participation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 9418
1119 Solving Facility Location Problem on Cluster Computing

Authors: Ei Phyo Wai, Nay Min Tun

Abstract:

Computation of facility location problem for every location in the country is not easy simultaneously. Solving the problem is described by using cluster computing. A technique is to design parallel algorithm by using local search with single swap method in order to solve that problem on clusters. Parallel implementation is done by the use of portable parallel programming, Message Passing Interface (MPI), on Microsoft Windows Compute Cluster. In this paper, it presents the algorithm that used local search with single swap method and implementation of the system of a facility to be opened by using MPI on cluster. If large datasets are considered, the process of calculating a reasonable cost for a facility becomes time consuming. The result shows parallel computation of facility location problem on cluster speedups and scales well as problem size increases.

Keywords: cluster, cost, demand, facility location

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1173
1118 Collision Detection Algorithm Based on Data Parallelism

Authors: Zhen Peng, Baifeng Wu

Abstract:

Modern computing technology enters the era of parallel computing with the trend of sustainable and scalable parallelism. Single Instruction Multiple Data (SIMD) is an important way to go along with the trend. It is able to gather more and more computing ability by increasing the number of processor cores without the need of modifying the program. Meanwhile, in the field of scientific computing and engineering design, many computation intensive applications are facing the challenge of increasingly large amount of data. Data parallel computing will be an important way to further improve the performance of these applications. In this paper, we take the accurate collision detection in building information modeling as an example. We demonstrate a model for constructing a data parallel algorithm. According to the model, a complex object is decomposed into the sets of simple objects; collision detection among complex objects is converted into those among simple objects. The resulting algorithm is a typical SIMD algorithm, and its advantages in parallelism and scalability is unparalleled in respect to the traditional algorithms.

Keywords: Data parallelism, collision detection, single instruction multiple data, building information modeling, continuous scalability.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 882
1117 Design and Implementation of Shared Memory based Parallel File System Logging Method for High Performance Computing

Authors: Hyeyoung Cho, Sungho Kim, SangDong Lee

Abstract:

I/O workload is a critical and important factor to analyze I/O pattern and file system performance. However tracing I/O operations on the fly distributed parallel file system is non-trivial due to collection overhead and a large volume of data. In this paper, we design and implement a parallel file system logging method for high performance computing using shared memory-based multi-layer scheme. It minimizes the overhead with reduced logging operation response time and provides efficient post-processing scheme through shared memory. Separated logging server can collect sequential logs from multiple clients in a cluster through packet communication. Implementation and evaluation result shows low overhead and high scalability of this architecture for high performance parallel logging analysis.

Keywords: I/O workload, PVFS, I/O Trace.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1320
1116 A Simulation Software for DNA Computing Algorithms Implementation

Authors: M. S. Muhammad, S. M. W. Masra, K. Kipli, N. Zamhari

Abstract:

The capturing of gel electrophoresis image represents the output of a DNA computing algorithm. Before this image is being captured, DNA computing involves parallel overlap assembly (POA) and polymerase chain reaction (PCR) that is the main of this computing algorithm. However, the design of the DNA oligonucleotides to represent a problem is quite complicated and is prone to errors. In order to reduce these errors during the design stage before the actual in-vitro experiment is carried out; a simulation software capable of simulating the POA and PCR processes is developed. This simulation software capability is unlimited where problem of any size and complexity can be simulated, thus saving cost due to possible errors during the design process. Information regarding the DNA sequence during the computing process as well as the computing output can be extracted at the same time using the simulation software.

Keywords: DNA computing, PCR, POA, simulation software

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1400
1115 Parallel-computing Approach for FFT Implementation on Digital Signal Processor (DSP)

Authors: Yi-Pin Hsu, Shin-Yu Lin

Abstract:

An efficient parallel form in digital signal processor can improve the algorithm performance. The butterfly structure is an important role in fast Fourier transform (FFT), because its symmetry form is suitable for hardware implementation. Although it can perform a symmetric structure, the performance will be reduced under the data-dependent flow characteristic. Even though recent research which call as novel memory reference reduction methods (NMRRM) for FFT focus on reduce memory reference in twiddle factor, the data-dependent property still exists. In this paper, we propose a parallel-computing approach for FFT implementation on digital signal processor (DSP) which is based on data-independent property and still hold the property of low-memory reference. The proposed method combines final two steps in NMRRM FFT to perform a novel data-independent structure, besides it is very suitable for multi-operation-unit digital signal processor and dual-core system. We have applied the proposed method of radix-2 FFT algorithm in low memory reference on TI TMSC320C64x DSP. Experimental results show the method can reduce 33.8% clock cycles comparing with the NMRRM FFT implementation and keep the low-memory reference property.

Keywords: Parallel-computing, FFT, low-memory reference, TIDSP.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1954
1114 Performance Evaluation of Parallel Surface Modeling and Generation on Actual and Virtual Multicore Systems

Authors: Nyeng P. Gyang

Abstract:

Even though past, current and future trends suggest that multicore and cloud computing systems are increasingly prevalent/ubiquitous, this class of parallel systems is nonetheless underutilized, in general, and barely used for research on employing parallel Delaunay triangulation for parallel surface modeling and generation, in particular. The performances, of actual/physical and virtual/cloud multicore systems/machines, at executing various algorithms, which implement various parallelization strategies of the incremental insertion technique of the Delaunay triangulation algorithm, were evaluated. T-tests were run on the data collected, in order to determine whether various performance metrics differences (including execution time, speedup and efficiency) were statistically significant. Results show that the actual machine is approximately twice faster than the virtual machine at executing the same programs for the various parallelization strategies. Results, which furnish the scalability behaviors of the various parallelization strategies, also show that some of the differences between the performances of these systems, during different runs of the algorithms on the systems, were statistically significant. A few pseudo superlinear speedup results, which were computed from the raw data collected, are not true superlinear speedup values. These pseudo superlinear speedup values, which arise as a result of one way of computing speedups, disappear and give way to asymmetric speedups, which are the accurate kind of speedups that occur in the experiments performed.

Keywords: Cloud computing systems, multicore systems, parallel delaunay triangulation, parallel surface modeling and generation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 612
1113 Statistical Reliability Based Modeling of Series and Parallel Operating Systems using Extreme Value Theory

Authors: Mohamad Mahdavi, Mojtaba Mahdavi

Abstract:

This paper tries to represent a new method for computing the reliability of a system which is arranged in series or parallel model. In this method we estimate life distribution function of whole structure using the asymptotic Extreme Value (EV) distribution of Type I, or Gumbel theory. We use EV distribution in minimal mode, for estimate the life distribution function of series structure and maximal mode for parallel system. All parameters also are estimated by Moments method. Reliability function and failure (hazard) rate and p-th percentile point of each function are determined. Other important indexes such as Mean Time to Failure (MTTF), Mean Time to repair (MTTR), for non-repairable and renewal systems in both of series and parallel structure will be computed.

Keywords: Reliability, extreme value, parallel, series, lifedistribution

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1776
1112 Concurrent Approach to Data Parallel Model using Java

Authors: Bala Dhandayuthapani Veerasamy

Abstract:

Parallel programming models exist as an abstraction of hardware and memory architectures. There are several parallel programming models in commonly use; they are shared memory model, thread model, message passing model, data parallel model, hybrid model, Flynn-s models, embarrassingly parallel computations model, pipelined computations model. These models are not specific to a particular type of machine or memory architecture. This paper expresses the model program for concurrent approach to data parallel model through java programming.

Keywords: Concurrent, Data Parallel, JDK, Parallel, Thread

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1601
1111 Damage Strain Analysis of Parallel Fiber Eutectic

Authors: Jian Zheng, Xinhua Ni, Xiequan Liu

Abstract:

According to isotropy of parallel fiber eutectic, the no- damage strain field in parallel fiber eutectic is obtained from the flexibility tensor of parallel fiber eutectic. Considering the damage behavior of parallel fiber eutectic, damage variables are introduced to determine the strain field of parallel fiber eutectic. The damage strains in the matrix, interphase, and fiber of parallel fiber eutectic are quantitatively analyzed. Results show that damage strains are not only associated with the fiber volume fraction of parallel fiber eutectic, but also with the damage degree.

Keywords: Parallel fiber eutectic, no-damage strain, damage strain, fiber volume fraction, damage degree.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 568
1110 Performance Analysis of List Scheduling in Heterogeneous Computing Systems

Authors: Keqin Li

Abstract:

Given a parallel program to be executed on a heterogeneous computing system, the overall execution time of the program is determined by a schedule. In this paper, we analyze the worst-case performance of the list scheduling algorithm for scheduling tasks of a parallel program in a mixed-machine heterogeneous computing system such that the total execution time of the program is minimized. We prove tight lower and upper bounds for the worst-case performance ratio of the list scheduling algorithm. We also examine the average-case performance of the list scheduling algorithm. Our experimental data reveal that the average-case performance of the list scheduling algorithm is much better than the worst-case performance and is very close to optimal, except for large systems with large heterogeneity. Thus, the list scheduling algorithm is very useful in real applications.

Keywords: Average-case performance, list scheduling algorithm, mixed-machine heterogeneous computing system, worst-case performance.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1097
1109 Enabling Automated Deployment for Cluster Computing in Distributed PC Classrooms

Authors: Shuen-Tai Wang, Ying-Chuan Chen, Hsi-Ya Chang

Abstract:

The rapid improvement of the microprocessor and network has made it possible for the PC cluster to compete with conventional supercomputers. Lots of high throughput type of applications can be satisfied by using the current desktop PCs, especially for those in PC classrooms, and leave the supercomputers for the demands from large scale high performance parallel computations. This paper presents our development on enabling an automated deployment mechanism for cluster computing to utilize the computing power of PCs such as reside in PC classroom. After well deployment, these PCs can be transformed into a pre-configured cluster computing resource immediately without touching the existing education/training environment installed on these PCs. Thus, the training activities will not be affected by this additional activity to harvest idle computing cycles. The time and manpower required to build and manage a computing platform in geographically distributed PC classrooms also can be reduced by this development.

Keywords: PC cluster, automated deployment, cluster computing, PC classroom.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1270
1108 Application and Limitation of Parallel Modelingin Multidimensional Sequential Pattern

Authors: Mahdi Esmaeili, Mansour Tarafdar

Abstract:

The goal of data mining algorithms is to discover useful information embedded in large databases. One of the most important data mining problems is discovery of frequently occurring patterns in sequential data. In a multidimensional sequence each event depends on more than one dimension. The search space is quite large and the serial algorithms are not scalable for very large datasets. To address this, it is necessary to study scalable parallel implementations of sequence mining algorithms. In this paper, we present a model for multidimensional sequence and describe a parallel algorithm based on data parallelism. Simulation experiments show good load balancing and scalable and acceptable speedup over different processors and problem sizes and demonstrate that our approach can works efficiently in a real parallel computing environment.

Keywords: Sequential Patterns, Data Mining, ParallelAlgorithm, Multidimensional Sequence Data

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1121
1107 High Performance in Parallel Data Integration: An Empirical Evaluation of the Ratio Between Processing Time and Number of Physical Nodes

Authors: Caspar von Seckendorff, Eldar Sultanow

Abstract:

Many studies have shown that parallelization decreases efficiency [1], [2]. There are many reasons for these decrements. This paper investigates those which appear in the context of parallel data integration. Integration processes generally cannot be allocated to packages of identical size (i. e. tasks of identical complexity). The reason for this is unknown heterogeneous input data which result in variable task lengths. Process delay is defined by the slowest processing node. It leads to a detrimental effect on the total processing time. With a real world example, this study will show that while process delay does initially increase with the introduction of more nodes it ultimately decreases again after a certain point. The example will make use of the cloud computing platform Hadoop and be run inside Amazon-s EC2 compute cloud. A stochastic model will be set up which can explain this effect.

Keywords: Process delay, speedup, efficiency, parallel computing, data integration, E-Commerce, Amazon Elastic Compute Cloud (EC2), Hadoop, Nutch.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1378
1106 Detecting the Edge of Multiple Images in Parallel

Authors: Prakash K. Aithal, U. Dinesh Acharya, Rajesh Gopakumar

Abstract:

Edge is variation of brightness in an image. Edge detection is useful in many application areas such as finding forests, rivers from a satellite image, detecting broken bone in a medical image etc. The paper discusses about finding edge of multiple aerial images in parallel. The proposed work tested on 38 images 37 colored and one monochrome image. The time taken to process N images in parallel is equivalent to time taken to process 1 image in sequential. Message Passing Interface (MPI) and Open Computing Language (OpenCL) is used to achieve task and pixel level parallelism respectively.

Keywords: Edge detection, multicore, GPU, openCL, MPI.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1962
1105 Implementation of Parallel Interface for Microprocessor Trainer

Authors: Moe Moe Htun, Khin Htar Nwe

Abstract:

In this paper, parallel interface for microprocessor trainer was implemented. A programmable parallel–port device such as the IC 8255A is initialized for simple input or output and for handshake input or output by choosing kinds of modes. The hardware connections and the programs can be used to interface microprocessor trainer and a personal computer by using IC 8255A. The assembly programs edited on PC-s editor can be downloaded to the trainer.

Keywords: Parallel I/O ports, parallel interface, trainer, two 8255 ICs.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2919
1104 Strip Decomposition Parallelization of Fast Direct Poisson Solver on a 3D Cartesian Staggered Grid

Authors: Minh Vuong Pham, Frédéric Plourde, Son Doan Kim

Abstract:

A strip domain decomposition parallel algorithm for fast direct Poisson solver is presented on a 3D Cartesian staggered grid. The parallel algorithm follows the principles of sequential algorithm for fast direct Poisson solver. Both Dirichlet and Neumann boundary conditions are addressed. Several test cases are likewise addressed in order to shed light on accuracy and efficiency in the strip domain parallelization algorithm. Actually the current implementation shows a very high efficiency when dealing with a large grid mesh up to 3.6 * 109 under massive parallel approach, which explicitly demonstrates that the proposed algorithm is ready for massive parallel computing.

Keywords: Strip-decomposition, parallelization, fast directpoisson solver.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1806
1103 Parallelization of Ensemble Kalman Filter (EnKF) for Oil Reservoirs with Time-lapse Seismic Data

Authors: Md Khairullah, Hai-Xiang Lin, Remus G. Hanea, Arnold W. Heemink

Abstract:

In this paper we describe the design and implementation of a parallel algorithm for data assimilation with ensemble Kalman filter (EnKF) for oil reservoir history matching problem. The use of large number of observations from time-lapse seismic leads to a large turnaround time for the analysis step, in addition to the time consuming simulations of the realizations. For efficient parallelization it is important to consider parallel computation at the analysis step. Our experiments show that parallelization of the analysis step in addition to the forecast step has good scalability, exploiting the same set of resources with some additional efforts.

Keywords: EnKF, Data assimilation, Parallel computing, Parallel efficiency.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1964
1102 Efficient Implementation of Serial and Parallel Support Vector Machine Training with a Multi-Parameter Kernel for Large-Scale Data Mining

Authors: Tatjana Eitrich, Bruno Lang

Abstract:

This work deals with aspects of support vector learning for large-scale data mining tasks. Based on a decomposition algorithm that can be run in serial and parallel mode we introduce a data transformation that allows for the usage of an expensive generalized kernel without additional costs. In order to speed up the decomposition algorithm we analyze the problem of working set selection for large data sets and analyze the influence of the working set sizes onto the scalability of the parallel decomposition scheme. Our modifications and settings lead to improvement of support vector learning performance and thus allow using extensive parameter search methods to optimize classification accuracy.

Keywords: Support Vector Machines, Shared Memory Parallel Computing, Large Data

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1345
1101 A Multi-Level WEB Based Parallel Processing System A Hierarchical Volunteer Computing Approach

Authors: Abdelrahman Ahmed Mohamed Osman

Abstract:

Over the past few years, a number of efforts have been exerted to build parallel processing systems that utilize the idle power of LAN-s and PC-s available in many homes and corporations. The main advantage of these approaches is that they provide cheap parallel processing environments for those who cannot afford the expenses of supercomputers and parallel processing hardware. However, most of the solutions provided are not very flexible in the use of available resources and very difficult to install and setup. In this paper, a multi-level web-based parallel processing system (MWPS) is designed (appendix). MWPS is based on the idea of volunteer computing, very flexible, easy to setup and easy to use. MWPS allows three types of subscribers: simple volunteers (single computers), super volunteers (full networks) and end users. All of these entities are coordinated transparently through a secure web site. Volunteer nodes provide the required processing power needed by the system end users. There is no limit on the number of volunteer nodes, and accordingly the system can grow indefinitely. Both volunteer and system users must register and subscribe. Once, they subscribe, each entity is provided with the appropriate MWPS components. These components are very easy to install. Super volunteer nodes are provided with special components that make it possible to delegate some of the load to their inner nodes. These inner nodes may also delegate some of the load to some other lower level inner nodes .... and so on. It is the responsibility of the parent super nodes to coordinate the delegation process and deliver the results back to the user. MWPS uses a simple behavior-based scheduler that takes into consideration the current load and previous behavior of processing nodes. Nodes that fulfill their contracts within the expected time get a high degree of trust. Nodes that fail to satisfy their contract get a lower degree of trust. MWPS is based on the .NET framework and provides the minimal level of security expected in distributed processing environments. Users and processing nodes are fully authenticated. Communications and messages between nodes are very secure. The system has been implemented using C#. MWPS may be used by any group of people or companies to establish a parallel processing or grid environment.

Keywords: Volunteer computing, Parallel Processing, XMLWebServices, .NET Remoting, Tuplespace.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1180