Search results for: parallel file system.
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 8877

Search results for: parallel file system.

8757 A High Performance MPI for Parallel and Distributed Computing

Authors: Prabu D., Vanamala V., Sanjeeb Kumar Deka, Sridharan R., Prahlada Rao B. B., Mohanram N.

Abstract:

Message Passing Interface is widely used for Parallel and Distributed Computing. MPICH and LAM are popular open source MPIs available to the parallel computing community also there are commercial MPIs, which performs better than MPICH etc. In this paper, we discuss a commercial Message Passing Interface, CMPI (C-DAC Message Passing Interface). C-MPI is an optimized MPI for CLUMPS. It is found to be faster and more robust compared to MPICH. We have compared performance of C-MPI and MPICH on Gigabit Ethernet network.

Keywords: C-MPI, C-VIA, HPC, MPICH, P-COMS, PMB

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1558
8756 Security over OFDM Fading Channels with Friendly Jammer

Authors: Munnujahan Ara

Abstract:

In this paper, we investigate the effect of friendly jamming power allocation strategies on the achievable average secrecy rate over a bank of parallel fading wiretap channels. We investigate the achievable average secrecy rate in parallel fading wiretap channels subject to Rayleigh and Rician fading. The achievable average secrecy rate, due to the presence of a line-of-sight component in the jammer channel is also evaluated. Moreover, we study the detrimental effect of correlation across the parallel sub-channels, and evaluate the corresponding decrease in the achievable average secrecy rate for the various fading configurations. We also investigate the tradeoff between the transmission power and the jamming power for a fixed total power budget. Our results, which are applicable to current orthogonal frequency division multiplexing (OFDM) communications systems, shed further light on the achievable average secrecy rates over a bank of parallel fading channels in the presence of friendly jammers.

Keywords: Fading parallel channels, Wire-tap channel, OFDM, Secrecy capacity, Power allocation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2254
8755 Strip Decomposition Parallelization of Fast Direct Poisson Solver on a 3D Cartesian Staggered Grid

Authors: Minh Vuong Pham, Frédéric Plourde, Son Doan Kim

Abstract:

A strip domain decomposition parallel algorithm for fast direct Poisson solver is presented on a 3D Cartesian staggered grid. The parallel algorithm follows the principles of sequential algorithm for fast direct Poisson solver. Both Dirichlet and Neumann boundary conditions are addressed. Several test cases are likewise addressed in order to shed light on accuracy and efficiency in the strip domain parallelization algorithm. Actually the current implementation shows a very high efficiency when dealing with a large grid mesh up to 3.6 * 109 under massive parallel approach, which explicitly demonstrates that the proposed algorithm is ready for massive parallel computing.

Keywords: Strip-decomposition, parallelization, fast directpoisson solver.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2049
8754 Parallelization of Ensemble Kalman Filter (EnKF) for Oil Reservoirs with Time-lapse Seismic Data

Authors: Md Khairullah, Hai-Xiang Lin, Remus G. Hanea, Arnold W. Heemink

Abstract:

In this paper we describe the design and implementation of a parallel algorithm for data assimilation with ensemble Kalman filter (EnKF) for oil reservoir history matching problem. The use of large number of observations from time-lapse seismic leads to a large turnaround time for the analysis step, in addition to the time consuming simulations of the realizations. For efficient parallelization it is important to consider parallel computation at the analysis step. Our experiments show that parallelization of the analysis step in addition to the forecast step has good scalability, exploiting the same set of resources with some additional efforts.

Keywords: EnKF, Data assimilation, Parallel computing, Parallel efficiency.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2283
8753 Efficient Implementation of Serial and Parallel Support Vector Machine Training with a Multi-Parameter Kernel for Large-Scale Data Mining

Authors: Tatjana Eitrich, Bruno Lang

Abstract:

This work deals with aspects of support vector learning for large-scale data mining tasks. Based on a decomposition algorithm that can be run in serial and parallel mode we introduce a data transformation that allows for the usage of an expensive generalized kernel without additional costs. In order to speed up the decomposition algorithm we analyze the problem of working set selection for large data sets and analyze the influence of the working set sizes onto the scalability of the parallel decomposition scheme. Our modifications and settings lead to improvement of support vector learning performance and thus allow using extensive parameter search methods to optimize classification accuracy.

Keywords: Support Vector Machines, Shared Memory Parallel Computing, Large Data

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1580
8752 Performance Analysis of List Scheduling in Heterogeneous Computing Systems

Authors: Keqin Li

Abstract:

Given a parallel program to be executed on a heterogeneous computing system, the overall execution time of the program is determined by a schedule. In this paper, we analyze the worst-case performance of the list scheduling algorithm for scheduling tasks of a parallel program in a mixed-machine heterogeneous computing system such that the total execution time of the program is minimized. We prove tight lower and upper bounds for the worst-case performance ratio of the list scheduling algorithm. We also examine the average-case performance of the list scheduling algorithm. Our experimental data reveal that the average-case performance of the list scheduling algorithm is much better than the worst-case performance and is very close to optimal, except for large systems with large heterogeneity. Thus, the list scheduling algorithm is very useful in real applications.

Keywords: Average-case performance, list scheduling algorithm, mixed-machine heterogeneous computing system, worst-case performance.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1352
8751 Simulation Data Management Approach for Developing Adaptronic Systems – The W-Model Methodology

Authors: Roland S. Nattermann, Reiner Anderl

Abstract:

Existing proceeding-models for the development of mechatronic systems provide a largely parallel action in the detailed development. This parallel approach is to take place also largely independent of one another in the various disciplines involved. An approach for a new proceeding-model provides a further development of existing models to use for the development of Adaptronic Systems. This approach is based on an intermediate integration and an abstract modeling of the adaptronic system. Based on this system-model a simulation of the global system behavior, due to external and internal factors or Forces is developed. For the intermediate integration a special data management system is used. According to the presented approach this data management system has a number of functions that are not part of the "normal" PDM functionality. Therefore a concept for a new data management system for the development of Adaptive system is presented in this paper. This concept divides the functions into six layers. In the first layer a system model is created, which divides the adaptronic system based on its components and the various technical disciplines. Moreover, the parameters and properties of the system are modeled and linked together with the requirements and the system model. The modeled parameters and properties result in a network which is analyzed in the second layer. From this analysis necessary adjustments to individual components for specific manipulation of the system behavior can be determined. The third layer contains an automatic abstract simulation of the system behavior. This simulation is a precursor for network analysis and serves as a filter. By the network analysis and simulation changes to system components are examined and necessary adjustments to other components are calculated. The other layers of the concept treat the automatic calculation of system reliability, the "normal" PDM-functionality and the integration of discipline-specific data into the system model. A prototypical implementation of an appropriate data management with the addition of an automatic system development is being implemented using the data management system ENOVIA SmarTeam V5 and the simulation system MATLAB.

Keywords: Adaptronic, Data-Management, LOEWE-CentreAdRIA

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2371
8750 Integrated Method for Detection of Unknown Steganographic Content

Authors: Magdalena Pejas

Abstract:

This article concerns the presentation of an integrated method for detection of steganographic content embedded by new unknown programs. The method is based on data mining and aggregated hypothesis testing. The article contains the theoretical basics used to deploy the proposed detection system and the description of improvement proposed for the basic system idea. Further main results of experiments and implementation details are collected and described. Finally example results of the tests are presented.

Keywords: Steganography, steganalysis, data embedding, data mining, feature extraction, knowledge base, system learning, hypothesis testing, error estimation, black box program, file structure.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1568
8749 Parallel Distributed Computational Microcontroller System for Adaptive Antenna Downlink Transmitter Power Optimization

Authors: K. Prajindra Sankar, S.K. Tiong, S.P. Johnny Koh

Abstract:

This paper presents a tested research concept that implements a complex evolutionary algorithm, genetic algorithm (GA), in a multi-microcontroller environment. Parallel Distributed Genetic Algorithm (PDGA) is employed in adaptive beam forming technique to reduce power usage of adaptive antenna at WCDMA base station. Adaptive antenna has dynamic beam that requires more advanced beam forming algorithm such as genetic algorithm which requires heavy computation and memory space. Microcontrollers are low resource platforms that are normally not associated with GAs, which are typically resource intensive. The aim of this project was to design a cooperative multiprocessor system by expanding the role of small scale PIC microcontrollers to optimize WCDMA base station transmitter power. Implementation results have shown that PDGA multi-microcontroller system returned optimal transmitted power compared to conventional GA.

Keywords: Microcontroller, Genetic Algorithm, Adaptiveantenna, Power optimization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1788
8748 Optimal Document Archiving and Fast Information Retrieval

Authors: Hazem M. El-Bakry, Ahmed A. Mohammed

Abstract:

In this paper, an intelligent algorithm for optimal document archiving is presented. It is kown that electronic archives are very important for information system management. Minimizing the size of the stored data in electronic archive is a main issue to reduce the physical storage area. Here, the effect of different types of Arabic fonts on electronic archives size is discussed. Simulation results show that PDF is the best file format for storage of the Arabic documents in electronic archive. Furthermore, fast information detection in a given PDF file is introduced. Such approach uses fast neural networks (FNNs) implemented in the frequency domain. The operation of these networks relies on performing cross correlation in the frequency domain rather than spatial one. It is proved mathematically and practically that the number of computation steps required for the presented FNNs is less than that needed by conventional neural networks (CNNs). Simulation results using MATLAB confirm the theoretical computations.

Keywords: Information Storage and Retrieval, Electronic Archiving, Fast Information Detection, Cross Correlation, Frequency Domain.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1589
8747 A Watermarking Scheme for MP3 Audio Files

Authors: Dimitrios Koukopoulos, Yiannis Stamatiou

Abstract:

In this work, we present for the first time in our perception an efficient digital watermarking scheme for mpeg audio layer 3 files that operates directly in the compressed data domain, while manipulating the time and subband/channel domain. In addition, it does not need the original signal to detect the watermark. Our scheme was implemented taking special care for the efficient usage of the two limited resources of computer systems: time and space. It offers to the industrial user the capability of watermark embedding and detection in time immediately comparable to the real music time of the original audio file that depends on the mpeg compression, while the end user/audience does not face any artifacts or delays hearing the watermarked audio file. Furthermore, it overcomes the disadvantage of algorithms operating in the PCMData domain to be vulnerable to compression/recompression attacks, as it places the watermark in the scale factors domain and not in the digitized sound audio data. The strength of our scheme, that allows it to be used with success in both authentication and copyright protection, relies on the fact that it gives to the users the enhanced capability their ownership of the audio file not to be accomplished simply by detecting the bit pattern that comprises the watermark itself, but by showing that the legal owner knows a hard to compute property of the watermark.

Keywords: Audio watermarking, mpeg audio layer 3, hardinstance generation, NP-completeness.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1654
8746 GPU-Accelerated Triangle Mesh Simplification Using Parallel Vertex Removal

Authors: Thomas Odaker, Dieter Kranzlmueller, Jens Volkert

Abstract:

We present an approach to triangle mesh simplification designed to be executed on the GPU. We use a quadric error metric to calculate an error value for each vertex of the mesh and order all vertices based on this value. This step is followed by the parallel removal of a number of vertices with the lowest calculated error values. To allow for the parallel removal of multiple vertices we use a set of per-vertex boundaries that prevent mesh foldovers even when simplification operations are performed on neighbouring vertices. We execute multiple iterations of the calculation of the vertex errors, ordering of the error values and removal of vertices until either a desired number of vertices remains in the mesh or a minimum error value is reached. This parallel approach is used to speed up the simplification process while maintaining mesh topology and avoiding foldovers at every step of the simplification.

Keywords: Computer graphics, half edge collapse, mesh simplification, precomputed simplification, topology preserving.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2797
8745 Advanced Robust PDC Fuzzy Control of Nonlinear Systems

Authors: M. Polanský

Abstract:

This paper introduces a new method called ARPDC (Advanced Robust Parallel Distributed Compensation) for automatic control of nonlinear systems. This method improves a quality of robust control by interpolating of robust and optimal controller. The weight of each controller is determined by an original criteria function for model validity and disturbance appreciation. ARPDC method is based on nonlinear Takagi-Sugeno (T-S) fuzzy systems and Parallel Distributed Compensation (PDC) control scheme. The relaxed stability conditions of ARPDC control of nominal system have been derived. The advantages of presented method are demonstrated on the inverse pendulum benchmark problem. From comparison between three different controllers (robust, optimal and ARPDC) follows, that ARPDC control is almost optimal with the robustness close to the robust controller. The results indicate that ARPDC algorithm can be a good alternative not only for a robust control, but in some cases also to an adaptive control of nonlinear systems.

Keywords: Robust control, optimal control, Takagi–Sugeno (TS) fuzzy models, linear matrix inequality (LMI), observer, Advanced Robust Parallel Distributed Compensation (ARPDC).

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1578
8744 Fast Database Indexing for Large Protein Sequence Collections Using Parallel N-Gram Transformation Algorithm

Authors: Jehad A. H. Hammad, Nur'Aini binti Abdul Rashid

Abstract:

With the rapid development in the field of life sciences and the flooding of genomic information, the need for faster and scalable searching methods has become urgent. One of the approaches that were investigated is indexing. The indexing methods have been categorized into three categories which are the lengthbased index algorithms, transformation-based algorithms and mixed techniques-based algorithms. In this research, we focused on the transformation based methods. We embedded the N-gram method into the transformation-based method to build an inverted index table. We then applied the parallel methods to speed up the index building time and to reduce the overall retrieval time when querying the genomic database. Our experiments show that the use of N-Gram transformation algorithm is an economical solution; it saves time and space too. The result shows that the size of the index is smaller than the size of the dataset when the size of N-Gram is 5 and 6. The parallel N-Gram transformation algorithm-s results indicate that the uses of parallel programming with large dataset are promising which can be improved further.

Keywords: Biological sequence, Database index, N-gram indexing, Parallel computing, Sequence retrieval.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2139
8743 Detecting the Edge of Multiple Images in Parallel

Authors: Prakash K. Aithal, U. Dinesh Acharya, Rajesh Gopakumar

Abstract:

Edge is variation of brightness in an image. Edge detection is useful in many application areas such as finding forests, rivers from a satellite image, detecting broken bone in a medical image etc. The paper discusses about finding edge of multiple aerial images in parallel. The proposed work tested on 38 images 37 colored and one monochrome image. The time taken to process N images in parallel is equivalent to time taken to process 1 image in sequential. Message Passing Interface (MPI) and Open Computing Language (OpenCL) is used to achieve task and pixel level parallelism respectively.

Keywords: Edge detection, multicore, GPU, openCL, MPI.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2342
8742 An Efficient Watermarking Method for MP3 Audio Files

Authors: Dimitrios Koukopoulos, Yiannis Stamatiou

Abstract:

In this work, we present for the first time in our perception an efficient digital watermarking scheme for mpeg audio layer 3 files that operates directly in the compressed data domain, while manipulating the time and subband/channel domain. In addition, it does not need the original signal to detect the watermark. Our scheme was implemented taking special care for the efficient usage of the two limited resources of computer systems: time and space. It offers to the industrial user the capability of watermark embedding and detection in time immediately comparable to the real music time of the original audio file that depends on the mpeg compression, while the end user/audience does not face any artifacts or delays hearing the watermarked audio file. Furthermore, it overcomes the disadvantage of algorithms operating in the PCMData domain to be vulnerable to compression/recompression attacks, as it places the watermark in the scale factors domain and not in the digitized sound audio data. The strength of our scheme, that allows it to be used with success in both authentication and copyright protection, relies on the fact that it gives to the users the enhanced capability their ownership of the audio file not to be accomplished simply by detecting the bit pattern that comprises the watermark itself, but by showing that the legal owner knows a hard to compute property of the watermark.

Keywords: Audio watermarking, mpeg audio layer 3, hard instance generation, NP-completeness.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1838
8741 Singularity Loci of Actuation Schemes for 3RRR Planar Parallel Manipulator

Authors: S. Ramana Babu, V. Ramachandra Raju, K. Ramji

Abstract:

This paper presents the effect of actuation schemes on the performance of parallel manipulators and also how the singularity loci have been changed in the reachable workspace of the manipulator with the choice of actuation scheme to drive the manipulator. The performance of the eight possible actuation schemes of 3RRR planar parallel manipulator is compared with each other. The optimal design problem is formulated to find the manipulator geometry that maximizes the singularity free conditioned workspace for all the eight actuation cases, the optimization problem is solved by using genetic algorithms.

Keywords: Actuation schemes, GCI, genetic algorithms.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1631
8740 Unrelated Parallel Machines Scheduling Problem Using an Ant Colony Optimization Approach

Authors: Y. K. Lin, H. T. Hsieh, F. Y. Hsieh

Abstract:

Total weighted tardiness is a measure of customer satisfaction. Minimizing it represents satisfying the general requirement of on-time delivery. In this research, we consider an ant colony optimization (ACO) algorithm to solve the problem of scheduling unrelated parallel machines to minimize total weighted tardiness. The problem is NP-hard in the strong sense. Computational results show that the proposed ACO algorithm is giving promising results compared to other existing algorithms.

Keywords: ant colony optimization, total weighted tardiness, unrelated parallel machines.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1894
8739 Bounds on Reliability of Parallel Computer Interconnection Systems

Authors: Ranjan Kumar Dash, Chita Ranjan Tripathy

Abstract:

The evaluation of residual reliability of large sized parallel computer interconnection systems is not practicable with the existing methods. Under such conditions, one must go for approximation techniques which provide the upper bound and lower bound on this reliability. In this context, a new approximation method for providing bounds on residual reliability is proposed here. The proposed method is well supported by two algorithms for simulation purpose. The bounds on residual reliability of three different categories of interconnection topologies are efficiently found by using the proposed method

Keywords: Parallel computer network, reliability, probabilisticgraph, interconnection networks.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1346
8738 Minimizing Makespan Subject to Budget Limitation in Parallel Flow Shop

Authors: Amin Sahraeian

Abstract:

One of the criteria in production scheduling is Make Span, minimizing this criteria causes more efficiently use of the resources specially machinery and manpower. By assigning some budget to some of the operations the operation time of these activities reduces and affects the total completion time of all the operations (Make Span). In this paper this issue is practiced in parallel flow shops. At first we convert parallel flow shop to a network model and by using a linear programming approach it is identified in order to minimize make span (the completion time of the network) which activities (operations) are better to absorb the predetermined and limited budget. Minimizing the total completion time of all the activities in the network is equivalent to minimizing make span in production scheduling.

Keywords: parallel flow shop, make span, linear programming, budget

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1782
8737 Searchable Encryption in Cloud Storage

Authors: Ren-Junn Hwang, Chung-Chien Lu, Jain-Shing Wu

Abstract:

Cloud outsource storage is one of important services in cloud computing. Cloud users upload data to cloud servers to reduce the cost of managing data and maintaining hardware and software. To ensure data confidentiality, users can encrypt their files before uploading them to a cloud system. However, retrieving the target file from the encrypted files exactly is difficult for cloud server. This study proposes a protocol for performing multikeyword searches for encrypted cloud data by applying k-nearest neighbor technology. The protocol ranks the relevance scores of encrypted files and keywords, and prevents cloud servers from learning search keywords submitted by a cloud user. To reduce the costs of file transfer communication, the cloud server returns encrypted files in order of relevance. Moreover, when a cloud user inputs an incorrect keyword and the number of wrong alphabet does not exceed a given threshold; the user still can retrieve the target files from cloud server. In addition, the proposed scheme satisfies security requirements for outsourced data storage.

Keywords: Fault-tolerance search, multi-keywords search, outsource storage, ranked search, searchable encryption.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3083
8736 Novel Ridge Orientation Based Approach for Fingerprint Identification Using Co-Occurrence Matrix

Authors: Mehran Yazdi, Zahra Adelpour, Batoul Bahraini, Yasaman Keshtkar Jahromi

Abstract:

In this paper we use the property of co-occurrence matrix in finding parallel lines in binary pictures for fingerprint identification. In our proposed algorithm, we reduce the noise by filtering the fingerprint images and then transfer the fingerprint images to binary images using a proper threshold. Next, we divide the binary images into some regions having parallel lines in the same direction. The lines in each region have a specific angle that can be used for comparison. This method is simple, performs the comparison step quickly and has a good resistance in the presence of the noise.

Keywords: Parallel lines detection, co-occurrence matrix, fingerprint identification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1358
8735 Testing of Electronic Control Unit Communication Interface

Authors: Petr Šimek, Kamil Kostruk

Abstract:

This paper deals with the problem of testing the Electronic Control Unit (ECU) for the specified function validation. Modern ECUs have many functions which need to be tested. This process requires tracking between the test and the specification. The technique discussed in this paper explores the system for automating this process. The paper focuses on the introduction to the problem in general, then it describes the proposed test system concept and its principle. It looks at how the process of the ECU interface specification file for automated interface testing and test tracking works. In the end, the future possible development of the project is discussed.

Keywords: Electronic control unit testing, embedded system, test generate, test automation, process automation, CAN bus, Ethernet.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 282
8734 Analytical Approach of the In-Pipe Robot on Branched Pipe Navigation and Its Solution

Authors: Yoon Koo Kang, Jung wan Park, Hyun Seok Yang

Abstract:

This paper determines most common model of in-pipe robots to derive its degree of freedom in order to compare with the necessary degree of freedom required for a system to move inside pipelines freely in order to derive analytical reason for losing control of in-pipe robots at branched pipe. DOF of most common mechanism in in-pipe robots can be calculated by considering the robot as a parallel manipulator. A new design based on previously researched in-pipe robot PAROYS has been suggested, and its possibility to overcome branched section has been simulated.

Keywords: Branched pipe, Degree of freedom, In-pipe robot, Parallel manipulator.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2222
8733 Analyzing the Factors that Cause Parallel Performance Degradation in Parallel Graph-Based Computations Using Graph500

Authors: Mustafa Elfituri, Jonathan Cook

Abstract:

Recently, graph-based computations have become more important in large-scale scientific computing as they can provide a methodology to model many types of relations between independent objects. They are being actively used in fields as varied as biology, social networks, cybersecurity, and computer networks. At the same time, graph problems have some properties such as irregularity and poor locality that make their performance different than regular applications performance. Therefore, parallelizing graph algorithms is a hard and challenging task. Initial evidence is that standard computer architectures do not perform very well on graph algorithms. Little is known exactly what causes this. The Graph500 benchmark is a representative application for parallel graph-based computations, which have highly irregular data access and are driven more by traversing connected data than by computation. In this paper, we present results from analyzing the performance of various example implementations of Graph500, including a shared memory (OpenMP) version, a distributed (MPI) version, and a hybrid version. We measured and analyzed all the factors that affect its performance in order to identify possible changes that would improve its performance. Results are discussed in relation to what factors contribute to performance degradation.

Keywords: Graph computation, Graph500 benchmark, parallel architectures, parallel programming, workload characterization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 553
8732 A Study on the Cloud Simulation with a Network Topology Generator

Authors: Jun-Kwon Jung, Sung-Min Jung, Tae-Kyung Kim, Tai-Myoung Chung

Abstract:

CloudSim is a useful tool to simulate the cloud environment. It shows the service availability, the power consumption, and the network traffic of services on the cloud environment. Moreover, it supports to calculate a network communication delay through a network topology data easily. CloudSim allows inputting a file of topology data, but it does not provide any generating process. Thus, it needs the file of topology data generated from some other tools. The BRITE is typical network topology generator. Also, it supports various type of topology generating algorithms. If CloudSim can include the BRITE, network simulation for clouds is easier than existing version. This paper shows the potential of connection between BRITE and CloudSim. Also, it proposes the direction to link between them.

Keywords: Cloud, simulation, topology, BRITE, network.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3780
8731 A Survey on Performance Tools for OpenMP

Authors: Mubarak S. Mohsen, Rosni Abdullah, Yong M. Teo

Abstract:

Advances in processors architecture, such as multicore, increase the size of complexity of parallel computer systems. With multi-core architecture there are different parallel languages that can be used to run parallel programs. One of these languages is OpenMP which embedded in C/Cµ or FORTRAN. Because of this new architecture and the complexity, it is very important to evaluate the performance of OpenMP constructs, kernels, and application program on multi-core systems. Performance is the activity of collecting the information about the execution characteristics of a program. Performance tools consists of at least three interfacing software layers, including instrumentation, measurement, and analysis. The instrumentation layer defines the measured performance events. The measurement layer determines what performance event is actually captured and how it is measured by the tool. The analysis layer processes the performance data and summarizes it into a form that can be displayed in performance tools. In this paper, a number of OpenMP performance tools are surveyed, explaining how each is used to collect, analyse, and display data collection.

Keywords: Parallel performance tools, OpenMP, multi-core.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1926
8730 Impact of Fair Share and its Configurations on Parallel Job Scheduling Algorithms

Authors: Sangsuree Vasupongayya

Abstract:

To provide a better understanding of fair share policies supported by current production schedulers and their impact on scheduling performance, A relative fair share policy supported in four well-known production job schedulers is evaluated in this study. The experimental results show that fair share indeed reduces heavy-demand users from dominating the system resources. However, the detailed per-user performance analysis show that some types of users may suffer unfairness under fair share, possibly due to priority mechanisms used by the current production schedulers. These users typically are not heavy-demands users but they have mixture of jobs that do not spread out.

Keywords: Fair share, Parallel job scheduler, Backfill, Measures

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2093
8729 Isotropic Stress Distribution in Cu/(001) Fe Two Sheets

Authors: A. Derardja, L. Baroura, M. Brioua

Abstract:

The nanotechnology based on epitaxial systems includes single or arranged misfit dislocations. In general, whatever is the type of dislocation or the geometry of the array formed by the dislocations; it is important for experimental studies to know exactly the stress distribution for which there is no analytical expression [1, 2]. This work, using a numerical analysis, deals with relaxation of epitaxial layers having at their interface a periodic network of edge misfit dislocations. The stress distribution is estimated by using isotropic elasticity. The results show that the thickness of the two sheets is a crucial parameter in the stress distributions and then in the profile of the two sheets. A comparative study between the case of single dislocation and the case of parallel network shows that the layers relaxed better when the interface is covered by a parallel arrangement of misfit. Consequently, a single dislocation at the interface produces an important stress field which can be reduced by inserting a parallel network of dislocations with suitable periodicity.

Keywords: Parallel array of misfit, interface, isotropic elasticity, single crystalline substrates, coherent interface

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1573
8728 Solving 94-bit ECDLP with 70 Computers in Parallel

Authors: Shunsuke Miyoshi, Yasuyuki Nogami, Takuya Kusaka, Nariyoshi Yamai

Abstract:

Elliptic curve discrete logarithm problem(ECDLP) is one of problems on which the security of pairing-based cryptography is based. This paper considers Pollard’s rho method to evaluate the security of ECDLP on Barreto-Naehrig(BN) curve that is an efficient pairing-friendly curve. Some techniques are proposed to make the rho method efficient. Especially, the group structure on BN curve, distinguished point method, and Montgomery trick are well-known techniques. This paper applies these techniques and shows its optimization. According to the experimental results for which a large-scale parallel system with MySQL is applied, 94-bit ECDLP was solved about 28 hours by parallelizing 71 computers.

Keywords: Pollard’s rho method, BN curve, Montgomery multiplication.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1873