Search results for: Parallel computing
1158 Network Based High Performance Computing
Authors: Karanjeet Singh Kahlon, Gurvinder Singh, Arjan Singh
Abstract: In the past few years, the view of high-performance applications and parallel computing has changed. Initially, such applications were targeted at dedicated parallel machines. Recently, the trend has shifted towards building meta-applications composed of several modules that exploit heterogeneous platforms and employ hybrid forms of parallelism. The aim of this paper is to propose a model of virtual parallel computing. The virtual parallel computing system provides a flexible object-oriented software framework that makes it easy for programmers to write various parallel applications.
Keywords: Applet, Efficiency, Java, LAN.
Downloads: 1334
1157 Performance Analysis of Parallel Client-Server Model Versus Parallel Mobile Agent Model
Authors: K. B. Manwade, G. A. Patil
Abstract: Mobile agents have motivated the creation of a new methodology for parallel computing. We introduce a methodology for the creation of parallel applications on the network. The proposed mobile-agent parallel processing framework uses multiple Java mobile agents. Each mobile agent can travel to a specified machine in the network to perform its tasks. We also introduce the concept of a master agent, a Java object capable of implementing a particular task of the target application. The master agent dynamically assigns tasks to the mobile agents. We have developed and tested a prototype application: Mobile Agent Based Parallel Computing. Boosted by the inherited benefits of using Java and mobile agents, our proposed methodology breaks the barriers between environments and could potentially exploit, in a parallel manner, all the available computational resources on the network. This paper elaborates on the performance issues of mobile agents for parallel computing.
Keywords: Parallel Computing, Mobile Agent.
Downloads: 1513
1156 Applying Autonomic Computing Concepts to Parallel Computing using Intelligent Agents
Authors: Blesson Varghese, Gerard T. McKee
Abstract: The work reported in this paper is motivated by the need to apply autonomic computing concepts to parallel computing systems. Advancing on prior work based on intelligent cores, a swarm-array computing approach, this paper focuses on 'intelligent agents', another swarm-array computing approach in which the task to be executed on a parallel computing core is considered as a swarm of autonomous agents. A task is carried to a computing core by carrier agents and is seamlessly transferred between cores in the event of a predicted failure, thereby achieving the self-ware objectives of autonomic computing. The feasibility of the proposed swarm-array computing approach is validated on a multi-agent simulator.
Keywords: Autonomic computing, intelligent agents, swarm-array computing.
Downloads: 1474
1155 Performance Comparison of Parallel Sorting Algorithms on the Cluster of Workstations
Authors: Lai Lai Win Kyi, Nay Min Tun
Abstract: Sorting has received the most attention among all computational tasks over the past years because sorted data is at the heart of many computations. Sorting is of additional importance to parallel computing because of its close relation to the task of routing data among processes, which is an essential part of many parallel algorithms. Many parallel sorting algorithms have been investigated for a variety of parallel computer architectures. In this paper, three parallel sorting algorithms have been implemented and compared in terms of their overall execution time. The algorithms implemented are odd-even transposition sort, parallel merge sort and parallel rank sort. A Cluster of Workstations, or Windows Compute Cluster, has been used to compare the implemented algorithms. The C# programming language is used to develop the sorting algorithms. The MPI (Message Passing Interface) library has been selected to establish communication and synchronization between processors. The time complexity of each parallel sorting algorithm is also analyzed.
Keywords: Cluster of Workstations, Parallel sorting algorithms, performance analysis, parallel computing and MPI.
Downloads: 1358
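Of the three algorithms compared above, odd-even transposition sort is the simplest to illustrate. The following is a minimal sequential sketch in Python, not the authors' C#/MPI implementation. The function name is hypothetical, and each pass of the inner loop stands in for compare-exchange steps that separate processes would perform concurrently.

```python
def odd_even_transposition_sort(values):
    """Sequentially simulate odd-even transposition sort (illustrative sketch)."""
    a = list(values)
    n = len(a)
    # n phases are enough to sort n elements.
    for phase in range(n):
        # Even phases compare pairs (0,1), (2,3), ...; odd phases (1,2), (3,4), ...
        start = phase % 2
        # The pairs within one phase do not overlap, so in a parallel
        # setting each compare-exchange could run on its own process.
        for i in range(start, n - 1, 2):
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
    return a
```

Because the pairs inside a phase are disjoint, the only synchronization a parallel version needs is a barrier (or neighbour exchange) between phases.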
1154 Towards Self-ware via Swarm-Array Computing
Authors: Blesson Varghese, Gerard McKee
Abstract: The work reported in this paper proposes Swarm-Array computing, a novel technique inspired by swarm robotics and built on the foundations of autonomic and parallel computing. The approach aims to apply autonomic computing constructs to parallel computing systems and in effect achieve the self-ware objectives that describe self-managing systems. The constitution of swarm-array computing, comprising four constituents, namely the computing system, the problem/task, the swarm and the landscape, is considered. Approaches that bind these constituents together are proposed. Space applications employing FPGAs are identified as a potential area for applying swarm-array computing for building reliable systems. The feasibility of a proposed approach is validated on the SeSAm multi-agent simulator, and landscapes are generated using the MATLAB toolkit.
Keywords: Swarm-Array computing, Autonomic computing, landscapes.
Downloads: 1491
1153 A Parallel Quadtree Approach for Image Compression using Wavelets
Authors: Hamed Vahdat Nejad, Hossein Deldari
Abstract: Wavelet transforms are multiresolution decompositions that can be used to analyze signals and images. Image compression is one of the major applications of wavelet transforms in image processing. It is considered one of the most powerful methods, providing a high compression ratio. However, its implementation is very time-consuming. On the other hand, parallel computing technologies offer an efficient way to perform image compression using wavelets. In this paper, we propose a parallel wavelet compression algorithm based on quadtrees. We implement the algorithm using MatlabMPI (a parallel, message-passing version of Matlab), compute its isoefficiency function, and show that it is scalable. Our experimental results also confirm the efficiency of the algorithm.
Keywords: Image compression, MPI, Parallel computing, Wavelets.
Downloads: 1915
1152 A High Performance MPI for Parallel and Distributed Computing
Authors: Prabu D., Vanamala V., Sanjeeb Kumar Deka, Sridharan R., Prahlada Rao B. B., Mohanram N.
Abstract: Message Passing Interface (MPI) is widely used for parallel and distributed computing. MPICH and LAM are popular open-source MPIs available to the parallel computing community; there are also commercial MPIs, which perform better than MPICH and others. In this paper, we discuss a commercial Message Passing Interface, C-MPI (C-DAC Message Passing Interface). C-MPI is an optimized MPI for CLUMPS. It is found to be faster and more robust than MPICH. We have compared the performance of C-MPI and MPICH on a Gigabit Ethernet network.
Keywords: C-MPI, C-VIA, HPC, MPICH, P-COMS, PMB.
Downloads: 1446
1151 A Consideration of the Achievement of Productive Level Parallel Programming Skills
Authors: Tadayoshi Horita, Masakazu Akiba, Mina Terauchi, Tsuneo Kanno
Abstract: This paper considers the achievement of productive-level parallel programming skills, based on data from graduation studies at the Polytechnic University of Japan. The data show that most students can achieve only parallel programming skills during the graduation study (about 600 to 700 hours) if the programming environment is limited to GPGPUs. However, the data also show that achieving productive-level parallel programming skills within the graduation study alone is a very demanding task for a student. In addition, the data show that parallel programming environments for GPGPU, such as CUDA and OpenCL, may be more suitable for parallel computing education than other environments such as MPI on a cluster system and Cell.B.E. These results should be useful not only for software development but also for hardware product development using computer technologies.
Keywords: Parallel computing, programming education, GPU, GPGPU, CUDA, OpenCL, MPI, Cell.B.E.
Downloads: 1575
1150 Neural Networks Approaches for Computing the Forward Kinematics of a Redundant Parallel Manipulator
Authors: H. Sadjadian, H. D. Taghirad, A. Fatehi
Abstract: In this paper, different approaches to solving the forward kinematics of a three-DOF actuator-redundant hydraulic parallel manipulator are presented. In contrast to serial manipulators, the forward kinematic map of parallel manipulators involves highly coupled nonlinear equations, which are almost impossible to solve analytically. The proposed methods use neural network identification with different structures to solve the problem. The accuracy of the results of each method is analyzed in detail, and the advantages and disadvantages of each in computing the forward kinematic map of the given mechanism are discussed. It is concluded that ANFIS presents the best performance compared to MLP, RBF and PNN networks in this particular application.
Keywords: Forward Kinematics, Neural Networks, Numerical Solution, Parallel Manipulators.
Downloads: 1828
1149 Experimental Parallel Architecture for Rendering 3D Model into MPEG-4 Format
Authors: Ajay Joshi, Surya Ismail
Abstract: This paper presents the initial findings of research into distributed computer rendering. The goal of the research is to create a distributed computer system capable of rendering a 3D model into an MPEG-4 stream. This paper outlines the initial design, software architecture and hardware setup for the system. Distributed computing means designing and implementing programs that run on two or more interconnected computing systems. Distributed computing is often used to speed up the rendering of graphical imaging. Distributed computing systems are used to generate images for movies, games and simulations. A topic of interest is the application of distributed computing to the MPEG-4 standard. During the course of the research, a distributed system will be created that can render a 3D model into an MPEG-4 stream. It is expected that applying distributed computing principles will speed up rendering, thus improving the usefulness and efficiency of the MPEG-4 standard.
Keywords: Cluster, parallel architecture, rendering, MPEG-4.
Downloads: 1357
1148 Solving Facility Location Problem on Cluster Computing
Authors: Ei Phyo Wai, Nay Min Tun
Abstract: Computing the facility location problem for every location in a country simultaneously is not easy. This paper describes solving the problem using cluster computing. The technique is to design a parallel algorithm that uses local search with the single swap method in order to solve the problem on clusters. The parallel implementation uses a portable parallel programming interface, the Message Passing Interface (MPI), on Microsoft Windows Compute Cluster. The paper presents the algorithm, which uses local search with the single swap method, and the implementation of the system for selecting a facility to be opened using MPI on a cluster. If large datasets are considered, the process of calculating a reasonable cost for a facility becomes time-consuming. The results show that parallel computation of the facility location problem on a cluster achieves speedup and scales well as the problem size increases.
Keywords: cluster, cost, demand, facility location.
Downloads: 1337
1147 Collision Detection Algorithm Based on Data Parallelism
Authors: Zhen Peng, Baifeng Wu
Abstract: Modern computing technology has entered the era of parallel computing, with a trend towards sustainable and scalable parallelism. Single Instruction Multiple Data (SIMD) is an important way to follow this trend. It can gather more and more computing power by increasing the number of processor cores without the need to modify the program. Meanwhile, in the fields of scientific computing and engineering design, many computation-intensive applications face the challenge of increasingly large amounts of data. Data parallel computing will be an important way to further improve the performance of these applications. In this paper, we take accurate collision detection in building information modeling as an example. We demonstrate a model for constructing a data parallel algorithm. According to the model, a complex object is decomposed into sets of simple objects; collision detection among complex objects is converted into collision detection among simple objects. The resulting algorithm is a typical SIMD algorithm, and its advantages in parallelism and scalability are unmatched by traditional algorithms.
Keywords: Data parallelism, collision detection, single instruction multiple data, building information modeling, continuous scalability.
Downloads: 1089
1146 Using Cloud Computing for E-Government: Challenges and Benefits
Authors: Sajjad Hashemi, Khalil Monfaredi, Mohammad Masdari
Abstract: Cloud computing is a style of computing formed from the aggregation and development of technologies such as grid computing, distributed computing, parallel computing and service-oriented architecture. Its aim is to provide computing, communication and storage resources as services in a safe environment, as fast as possible, delivered virtually via the Internet. Since the services provided in e-government are available via the Internet, cloud computing can be used in the implementation of e-government architecture to provide better service at the lowest economic cost. In this paper, methods of using cloud computing in e-government are studied, the challenges and benefits of using the cloud in e-government are identified, and proposals are offered to overcome its shortcomings and to encourage the partnership of governments and people in using this new and economical technology.
Keywords: Benefits, Cloud computing, Committee, Challenges, E-Government, Participation.
Downloads: 9718
1145 Design and Implementation of Shared Memory based Parallel File System Logging Method for High Performance Computing
Authors: Hyeyoung Cho, Sungho Kim, SangDong Lee
Abstract: I/O workload is a critical factor in analyzing I/O patterns and file system performance. However, tracing I/O operations on the fly in a distributed parallel file system is non-trivial due to collection overhead and the large volume of data. In this paper, we design and implement a parallel file system logging method for high performance computing using a shared-memory-based multi-layer scheme. It minimizes the overhead by reducing the response time of logging operations and provides an efficient post-processing scheme through shared memory. A separate logging server can collect sequential logs from multiple clients in a cluster through packet communication. Implementation and evaluation results show the low overhead and high scalability of this architecture for high performance parallel logging analysis.
Keywords: I/O workload, PVFS, I/O Trace.
Downloads: 1453
1144 A Simulation Software for DNA Computing Algorithms Implementation
Authors: M. S. Muhammad, S. M. W. Masra, K. Kipli, N. Zamhari
Abstract: The captured gel electrophoresis image represents the output of a DNA computing algorithm. Before this image is captured, DNA computing involves parallel overlap assembly (POA) and polymerase chain reaction (PCR), which are the main steps of this computing algorithm. However, the design of the DNA oligonucleotides to represent a problem is quite complicated and prone to errors. In order to reduce these errors during the design stage, before the actual in-vitro experiment is carried out, simulation software capable of simulating the POA and PCR processes is developed. The capability of this simulation software is unlimited in that problems of any size and complexity can be simulated, thus saving cost due to possible errors during the design process. Information regarding the DNA sequence during the computing process, as well as the computing output, can be extracted at the same time using the simulation software.
Keywords: DNA computing, PCR, POA, simulation software.
Downloads: 1662
1143 Parallel-computing Approach for FFT Implementation on Digital Signal Processor (DSP)
Authors: Yi-Pin Hsu, Shin-Yu Lin
Abstract: An efficient parallel form in a digital signal processor can improve algorithm performance. The butterfly structure plays an important role in the fast Fourier transform (FFT) because its symmetric form is suitable for hardware implementation. Although the structure is symmetric, performance is reduced by its data-dependent flow characteristic. Even though recent research, known as novel memory reference reduction methods (NMRRM) for FFT, focuses on reducing memory references to twiddle factors, the data-dependent property still exists. In this paper, we propose a parallel-computing approach for FFT implementation on a digital signal processor (DSP) that is based on a data-independent property and still retains the property of low memory reference. The proposed method combines the final two steps of the NMRRM FFT to form a novel data-independent structure; in addition, it is very suitable for multi-operation-unit digital signal processors and dual-core systems. We have applied the proposed method to a radix-2 FFT algorithm with low memory reference on a TI TMS320C64x DSP. Experimental results show that the method reduces clock cycles by 33.8% compared with the NMRRM FFT implementation while keeping the low-memory-reference property.
Keywords: Parallel-computing, FFT, low-memory reference, TI DSP.
Downloads: 2082
1142 Performance Evaluation of Parallel Surface Modeling and Generation on Actual and Virtual Multicore Systems
Authors: Nyeng P. Gyang
Abstract: Even though past, current and future trends suggest that multicore and cloud computing systems are increasingly prevalent, this class of parallel systems is nonetheless underutilized in general, and in particular is barely used for research on employing parallel Delaunay triangulation for parallel surface modeling and generation. The performance of actual (physical) and virtual (cloud) multicore machines was evaluated when executing various algorithms that implement different parallelization strategies of the incremental insertion technique of the Delaunay triangulation algorithm. T-tests were run on the data collected in order to determine whether differences in various performance metrics (including execution time, speedup and efficiency) were statistically significant. Results show that the actual machine is approximately twice as fast as the virtual machine at executing the same programs for the various parallelization strategies. Results, which furnish the scalability behaviors of the various parallelization strategies, also show that some of the differences between the performances of these systems, during different runs of the algorithms on the systems, were statistically significant. A few pseudo-superlinear speedup results, which were computed from the raw data collected, are not true superlinear speedup values. These pseudo-superlinear speedup values, which arise as a result of one way of computing speedups, disappear and give way to asymmetric speedups, which are the accurate kind of speedups that occur in the experiments performed.
Keywords: Cloud computing systems, multicore systems, parallel Delaunay triangulation, parallel surface modeling and generation.
Downloads: 765
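The performance metrics named in the abstract above follow standard definitions: speedup is the sequential execution time divided by the parallel execution time, and efficiency is speedup divided by the number of cores. A small sketch, with hypothetical function names and made-up timings (not data from the paper):

```python
def speedup(t_serial, t_parallel):
    """Speedup = serial execution time / parallel execution time."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, cores):
    """Efficiency = speedup / number of cores (1.0 means ideal scaling)."""
    return speedup(t_serial, t_parallel) / cores


# Illustrative numbers only: 120 s on one core, 40 s on four cores.
s = speedup(120.0, 40.0)        # 3.0
e = efficiency(120.0, 40.0, 4)  # 0.75
```

A speedup greater than the core count (efficiency above 1.0) is what is usually called superlinear; the abstract notes that such values can be an artifact of how speedup is computed.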
1141 Concurrent Approach to Data Parallel Model using Java
Authors: Bala Dhandayuthapani Veerasamy
Abstract: Parallel programming models exist as an abstraction of hardware and memory architectures. Several parallel programming models are in common use: the shared memory model, thread model, message passing model, data parallel model, hybrid model, Flynn's models, embarrassingly parallel computations model, and pipelined computations model. These models are not specific to a particular type of machine or memory architecture. This paper presents a model program for a concurrent approach to the data parallel model using Java programming.
Keywords: Concurrent, Data Parallel, JDK, Parallel, Thread.
Downloads: 1808
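In the data parallel model mentioned above, the same operation is applied to different partitions of a data set by concurrent workers. The paper works in Java; the following is only a rough Python sketch of the idea using the standard `concurrent.futures` module, with a hypothetical helper name and chunking scheme that are assumptions, not the author's program.

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_map_chunks(func, data, workers=4):
    """Apply func to every element, one chunk of the data per worker task."""
    data = list(data)
    if not data:
        return []
    # Ceiling division so that at most `workers` chunks are created.
    size = -(-len(data) // workers)
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Every task runs the same operation on its own partition.
        parts = list(pool.map(lambda chunk: [func(x) for x in chunk], chunks))
    # pool.map preserves chunk order, so concatenation restores input order.
    return [y for part in parts for y in part]
```

For example, `parallel_map_chunks(lambda x: x * x, range(10), workers=3)` returns the squares of 0 to 9 in order. In CPython, threads illustrate the structure of the model rather than guaranteeing a speedup for CPU-bound work.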
1140 Damage Strain Analysis of Parallel Fiber Eutectic
Authors: Jian Zheng, Xinhua Ni, Xiequan Liu
Abstract: According to the isotropy of parallel fiber eutectic, the no-damage strain field in parallel fiber eutectic is obtained from its flexibility tensor. Considering the damage behavior of parallel fiber eutectic, damage variables are introduced to determine its strain field. The damage strains in the matrix, interphase, and fiber of parallel fiber eutectic are quantitatively analyzed. Results show that damage strains are associated not only with the fiber volume fraction of parallel fiber eutectic, but also with the damage degree.
Keywords: Parallel fiber eutectic, no-damage strain, damage strain, fiber volume fraction, damage degree.
Downloads: 711
1139 Statistical Reliability Based Modeling of Series and Parallel Operating Systems using Extreme Value Theory
Authors: Mohamad Mahdavi, Mojtaba Mahdavi
Abstract: This paper presents a new method for computing the reliability of a system arranged in a series or parallel model. In this method, we estimate the life distribution function of the whole structure using the asymptotic Extreme Value (EV) distribution of Type I, or Gumbel theory. We use the EV distribution in minimal mode to estimate the life distribution function of a series structure, and in maximal mode for a parallel system. All parameters are estimated by the method of moments. The reliability function, failure (hazard) rate and p-th percentile point of each function are determined. Other important indexes, such as Mean Time to Failure (MTTF) and Mean Time to Repair (MTTR), for non-repairable and renewal systems in both series and parallel structures are computed.
Keywords: Reliability, extreme value, parallel, series, life distribution.
Downloads: 1966
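The choice of minimal and maximal modes above reflects a standard fact: a series system fails as soon as its first component fails (its lifetime is the minimum of the component lifetimes), while a parallel system survives until its last component fails (the maximum). For independent components with known reliabilities at a given time, this gives the textbook formulas sketched below. The function names are hypothetical, independence is an assumption, and the sketch does not reproduce the paper's Gumbel-based estimation.

```python
from math import prod


def series_reliability(component_reliabilities):
    """A series system works only if every component works."""
    return prod(component_reliabilities)


def parallel_reliability(component_reliabilities):
    """A parallel system fails only if every component fails."""
    return 1.0 - prod(1.0 - r for r in component_reliabilities)
```

With two components of reliability 0.9 and 0.8, the series arrangement gives about 0.72 and the parallel arrangement about 0.98.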
1138 Performance Analysis of List Scheduling in Heterogeneous Computing Systems
Authors: Keqin Li
Abstract: Given a parallel program to be executed on a heterogeneous computing system, the overall execution time of the program is determined by a schedule. In this paper, we analyze the worst-case performance of the list scheduling algorithm for scheduling tasks of a parallel program in a mixed-machine heterogeneous computing system such that the total execution time of the program is minimized. We prove tight lower and upper bounds for the worst-case performance ratio of the list scheduling algorithm. We also examine the average-case performance of the list scheduling algorithm. Our experimental data reveal that the average-case performance of the list scheduling algorithm is much better than the worst-case performance and is very close to optimal, except for large systems with large heterogeneity. Thus, the list scheduling algorithm is very useful in real applications.
Keywords: Average-case performance, list scheduling algorithm, mixed-machine heterogeneous computing system, worst-case performance.
Downloads: 1234
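The abstract does not spell out the scheduling model, so the following is only a sketch of one common formulation of list scheduling: independent tasks are taken in list order, and each is assigned to whichever machine becomes idle first. The function name, the `exec_times[j][i]` layout (time of task j on machine i) and the absence of precedence constraints are all assumptions made for illustration.

```python
import heapq


def list_schedule(exec_times):
    """Return (makespan, assignment) for tasks scheduled in list order."""
    if not exec_times:
        return 0.0, []
    machines = len(exec_times[0])
    # Min-heap of (time the machine becomes idle, machine index).
    idle = [(0.0, i) for i in range(machines)]
    heapq.heapify(idle)
    assignment = []
    for task_times in exec_times:
        # The next task in the list goes to the earliest-idle machine.
        ready, i = heapq.heappop(idle)
        assignment.append(i)
        heapq.heappush(idle, (ready + task_times[i], i))
    makespan = max(t for t, _ in idle)
    return makespan, assignment
```

For `exec_times = [[2, 4], [3, 1], [2, 2]]`, the sketch assigns the tasks to machines 0, 1 and 1 and yields a makespan of 3.0. Note that the rule ignores how fast the chosen machine is for the task, which is one reason worst-case and average-case behaviour can differ on heterogeneous systems.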
1137 Enabling Automated Deployment for Cluster Computing in Distributed PC Classrooms
Authors: Shuen-Tai Wang, Ying-Chuan Chen, Hsi-Ya Chang
Abstract: Rapid improvements in microprocessors and networks have made it possible for PC clusters to compete with conventional supercomputers. Many high-throughput applications can be satisfied by current desktop PCs, especially those in PC classrooms, leaving the supercomputers for the demands of large-scale, high-performance parallel computations. This paper presents our development of an automated deployment mechanism for cluster computing that utilizes the computing power of PCs such as those residing in PC classrooms. Once deployed, these PCs can be transformed into a pre-configured cluster computing resource immediately, without touching the existing education/training environment installed on them. Thus, training activities are not affected by this additional activity to harvest idle computing cycles. The time and manpower required to build and manage a computing platform in geographically distributed PC classrooms can also be reduced by this development.
Keywords: PC cluster, automated deployment, cluster computing, PC classroom.
Downloads: 1421
1136 Application and Limitation of Parallel Modeling in Multidimensional Sequential Pattern
Authors: Mahdi Esmaeili, Mansour Tarafdar
Abstract: The goal of data mining algorithms is to discover useful information embedded in large databases. One of the most important data mining problems is the discovery of frequently occurring patterns in sequential data. In a multidimensional sequence, each event depends on more than one dimension. The search space is quite large, and serial algorithms are not scalable for very large datasets. To address this, it is necessary to study scalable parallel implementations of sequence mining algorithms. In this paper, we present a model for multidimensional sequences and describe a parallel algorithm based on data parallelism. Simulation experiments show good load balancing and scalable, acceptable speedup over different processors and problem sizes, and demonstrate that our approach can work efficiently in a real parallel computing environment.
Keywords: Sequential Patterns, Data Mining, Parallel Algorithm, Multidimensional Sequence Data.
Downloads: 1255
1135 Implementation of Parallel Interface for Microprocessor Trainer
Authors: Moe Moe Htun, Khin Htar Nwe
Abstract: In this paper, a parallel interface for a microprocessor trainer is implemented. A programmable parallel-port device such as the IC 8255A is initialized for simple input or output and for handshake input or output by choosing among its modes. The hardware connections and the programs can be used to interface the microprocessor trainer with a personal computer using the IC 8255A. Assembly programs edited in the PC's editor can be downloaded to the trainer.
Keywords: Parallel I/O ports, parallel interface, trainer, two 8255 ICs.
Downloads: 3044
1134 Detecting the Edge of Multiple Images in Parallel
Authors: Prakash K. Aithal, U. Dinesh Acharya, Rajesh Gopakumar
Abstract: An edge is a variation of brightness in an image. Edge detection is useful in many application areas, such as finding forests and rivers in a satellite image or detecting a broken bone in a medical image. The paper discusses finding the edges of multiple aerial images in parallel. The proposed work was tested on 38 images: 37 color images and one monochrome image. The time taken to process N images in parallel is equivalent to the time taken to process one image sequentially. Message Passing Interface (MPI) and Open Computing Language (OpenCL) are used to achieve task-level and pixel-level parallelism, respectively.
Keywords: Edge detection, multicore, GPU, OpenCL, MPI.
Downloads: 2101
1133 High Performance in Parallel Data Integration: An Empirical Evaluation of the Ratio Between Processing Time and Number of Physical Nodes
Authors: Caspar von Seckendorff, Eldar Sultanow
Abstract: Many studies have shown that parallelization decreases efficiency. There are many reasons for these decrements. This paper investigates those that appear in the context of parallel data integration. Integration processes generally cannot be allocated to packages of identical size (i.e. tasks of identical complexity). The reason for this is unknown, heterogeneous input data, which results in variable task lengths. Process delay is defined by the slowest processing node and has a detrimental effect on the total processing time. With a real-world example, this study shows that while process delay initially increases with the introduction of more nodes, it ultimately decreases again after a certain point. The example makes use of the cloud computing platform Hadoop and is run inside Amazon's EC2 compute cloud. A stochastic model is set up which can explain this effect.
Keywords: Process delay, speedup, efficiency, parallel computing, data integration, E-Commerce, Amazon Elastic Compute Cloud (EC2), Hadoop, Nutch.
Downloads: 1527
1132 Strip Decomposition Parallelization of Fast Direct Poisson Solver on a 3D Cartesian Staggered Grid
Authors: Minh Vuong Pham, Frédéric Plourde, Son Doan Kim
Abstract: A strip domain decomposition parallel algorithm for a fast direct Poisson solver is presented on a 3D Cartesian staggered grid. The parallel algorithm follows the principles of the sequential algorithm for the fast direct Poisson solver. Both Dirichlet and Neumann boundary conditions are addressed. Several test cases are likewise addressed in order to shed light on the accuracy and efficiency of the strip domain parallelization algorithm. The current implementation shows very high efficiency when dealing with a large grid mesh of up to 3.6 × 10^9 under a massively parallel approach, which explicitly demonstrates that the proposed algorithm is ready for massively parallel computing.
Keywords: Strip-decomposition, parallelization, fast direct Poisson solver.
Downloads: 1949
1131 Parallelization of Ensemble Kalman Filter (EnKF) for Oil Reservoirs with Time-lapse Seismic Data
Authors: Md Khairullah, Hai-Xiang Lin, Remus G. Hanea, Arnold W. Heemink
Abstract: In this paper we describe the design and implementation of a parallel algorithm for data assimilation with the ensemble Kalman filter (EnKF) for the oil reservoir history matching problem. The use of a large number of observations from time-lapse seismic data leads to a long turnaround time for the analysis step, in addition to the time-consuming simulations of the realizations. For efficient parallelization, it is important to consider parallel computation at the analysis step. Our experiments show that parallelizing the analysis step in addition to the forecast step gives good scalability, exploiting the same set of resources with some additional effort.
Keywords: EnKF, Data assimilation, Parallel computing, Parallel efficiency.
Downloads: 2125
1130 Efficient Implementation of Serial and Parallel Support Vector Machine Training with a Multi-Parameter Kernel for Large-Scale Data Mining
Authors: Tatjana Eitrich, Bruno Lang
Abstract: This work deals with aspects of support vector learning for large-scale data mining tasks. Based on a decomposition algorithm that can be run in serial and parallel mode, we introduce a data transformation that allows the use of an expensive generalized kernel without additional costs. In order to speed up the decomposition algorithm, we analyze the problem of working set selection for large data sets and the influence of the working set sizes on the scalability of the parallel decomposition scheme. Our modifications and settings improve support vector learning performance and thus allow extensive parameter search methods to be used to optimize classification accuracy.
Keywords: Support Vector Machines, Shared Memory Parallel Computing, Large Data.
Downloads: 1466
1129 A Technique for Reachability Graph Generation for the Petri Net Models of Parallel Processes
Authors: Farooq Ahmad, Hejiao Huang, Xiaolong Wang
Abstract: Reachability graph (RG) generation suffers from the problem of exponential space and time complexity. To alleviate the more critical problem of time complexity, this paper presents a new approach to RG generation for the Petri net (PN) models of parallel processes. Independent RGs for each parallel process in the PN structure are generated in parallel, and the cross-product of these RGs forms the exhaustive state space from which the RG of the given parallel system is determined. The complexity analysis of the presented algorithm shows a significant decrease in the time complexity of RG generation. The proposed technique is applicable to parallel programs having multiple threads with the synchronization problem.
Keywords: Parallel processes, Petri net, reachability graph, time complexity.
Downloads: 1897