Search results for: parallel algorithm
4608 Algorithm Optimization to Sort in Parallel by Decreasing the Number of the Processors in SIMD (Single Instruction Multiple Data) Systems
Authors: Ali Hosseini
Abstract:
Paralleling is a mechanism to decrease the time necessary to execute the programs. Sorting is one of the important operations to be used in different systems in a way that the proper function of many algorithms and operations depend on sorted data. CRCW_SORT algorithm executes ‘N’ elements sorting in O(1) time on SIMD (Single Instruction Multiple Data) computers with n^2/2-n/2 number of processors. In this article having presented a mechanism by dividing the input string by the hinge element into two less strings the number of the processors to be used in sorting ‘N’ elements in O(1) time has decreased to n^2/8-n/4 in the best state; by this mechanism the best state is when the hinge element is the middle one and the worst state is when it is minimum. The findings from assessing the proposed algorithm by other methods on data collection and number of the processors indicate that the proposed algorithm uses less processors to sort during execution than other methods.Keywords: CRCW, SIMD (Single Instruction Multiple Data) computers, parallel computers, number of the processors
Procedia PDF Downloads 3124607 Parallel Pipelined Conjugate Gradient Algorithm on Heterogeneous Platforms
Authors: Sergey Kopysov, Nikita Nedozhogin, Leonid Tonkov
Abstract:
The article presents a parallel iterative solver for large sparse linear systems which can be used on a heterogeneous platform. Traditionally, the problem of solving linear systems does not scale well on multi-CPU/multi-GPUs clusters. For example, most of the attempts to implement the classical conjugate gradient method were at best counted in the same amount of time as the problem was enlarged. The paper proposes the pipelined variant of the conjugate gradient method (PCG), a formulation that is potentially better suited for hybrid CPU/GPU computing since it requires only one synchronization point per one iteration instead of two for standard CG. The standard and pipelined CG methods need the vector entries generated by the current GPU and other GPUs for matrix-vector products. So the communication between GPUs becomes a major performance bottleneck on multi GPU cluster. The article presents an approach to minimize the communications between parallel parts of algorithms. Additionally, computation and communication can be overlapped to reduce the impact of data exchange. Using the pipelined version of the CG method with one synchronization point, the possibility of asynchronous calculations and communications, load balancing between the CPU and GPU for solving the large linear systems allows for scalability. The algorithm is implemented with the combined use of technologies: MPI, OpenMP, and CUDA. We show that almost optimum speed up on 8-CPU/2GPU may be reached (relatively to a one GPU execution). The parallelized solver achieves a speedup of up to 5.49 times on 16 NVIDIA Tesla GPUs, as compared to one GPU.Keywords: conjugate gradient, GPU, parallel programming, pipelined algorithm
Procedia PDF Downloads 1654606 Identification of Vehicle Dynamic Parameters by Using Optimized Exciting Trajectory on 3- DOF Parallel Manipulator
Authors: Di Yao, Gunther Prokop, Kay Buttner
Abstract:
Dynamic parameters, including the center of gravity, mass and inertia moments of vehicle, play an essential role in vehicle simulation, collision test and real-time control of vehicle active systems. To identify the important vehicle dynamic parameters, a systematic parameter identification procedure is studied in this work. In the first step of the procedure, a conceptual parallel manipulator (virtual test rig), which possesses three rotational degrees-of-freedom, is firstly proposed. To realize kinematic characteristics of the conceptual parallel manipulator, the kinematic analysis consists of inverse kinematic and singularity architecture is carried out. Based on the Euler's rotation equations for rigid body dynamics, the dynamic model of parallel manipulator and derivation of measurement matrix for parameter identification are presented subsequently. In order to reduce the sensitivity of parameter identification to measurement noise and other unexpected disturbances, a parameter optimization process of searching for optimal exciting trajectory of parallel manipulator is conducted in the following section. For this purpose, the 321-Euler-angles defined by parameterized finite-Fourier-series are primarily used to describe the general exciting trajectory of parallel manipulator. To minimize the condition number of measurement matrix for achieving better parameter identification accuracy, the unknown coefficients of parameterized finite-Fourier-series are estimated by employing an iterative algorithm based on MATLAB®. Meanwhile, the iterative algorithm will ensure the parallel manipulator still keeps in an achievable working status during the execution of optimal exciting trajectory. It is showed that the proposed procedure and methods in this work can effectively identify the vehicle dynamic parameters and could be an important application of parallel manipulator in the fields of parameter identification and test rig development.Keywords: parameter identification, parallel manipulator, singularity architecture, dynamic modelling, exciting trajectory
Procedia PDF Downloads 2674605 Parallel Asynchronous Multi-Splitting Methods for Differential Algebraic Systems
Authors: Malika Elkyal
Abstract:
We consider an iterative parallel multi-splitting method for differential algebraic equations. The main feature of the proposed idea is to use the asynchronous form. We prove that the multi-splitting technique can effectively accelerate the convergent performance of the iterative process. The main characteristic of an asynchronous mode is that the local algorithm does not have to wait at predetermined messages to become available. We allow some processors to communicate more frequently than others, and we allow the communication delays to be substantial and unpredictable. Accordingly, we note that synchronous algorithms in the computer science sense are particular cases of our formulation of asynchronous one.Keywords: parallel methods, asynchronous mode, multisplitting, differential algebraic equations
Procedia PDF Downloads 5604604 Natural Convection between Two Parallel Wavy Plates
Authors: Si Abdallah Mayouf
Abstract:
In this work, the effects of the wavy surface on free convection heat transfer boundary layer flow between two parallel wavy plates have been studied numerically. The two plates are considered at a constant temperature. The equations and the boundary conditions are discretized by the finite difference scheme and solved numerically using the Gauss-Seidel algorithm. The important parameters in this problem are the amplitude of the wavy surfaces and the distance between the two wavy plates. Results are presented as velocity profiles, temperature profiles and local Nusselt number according to the important parameters.Keywords: free convection, wavy surface, parallel plates, fluid dynamics
Procedia PDF Downloads 3084603 GPU Accelerated Fractal Image Compression for Medical Imaging in Parallel Computing Platform
Authors: Md. Enamul Haque, Abdullah Al Kaisan, Mahmudur R. Saniat, Aminur Rahman
Abstract:
In this paper, we have implemented both sequential and parallel version of fractal image compression algorithms using CUDA (Compute Unified Device Architecture) programming model for parallelizing the program in Graphics Processing Unit for medical images, as they are highly similar within the image itself. There is several improvements in the implementation of the algorithm as well. Fractal image compression is based on the self similarity of an image, meaning an image having similarity in majority of the regions. We take this opportunity to implement the compression algorithm and monitor the effect of it using both parallel and sequential implementation. Fractal compression has the property of high compression rate and the dimensionless scheme. Compression scheme for fractal image is of two kinds, one is encoding and another is decoding. Encoding is very much computational expensive. On the other hand decoding is less computational. The application of fractal compression to medical images would allow obtaining much higher compression ratios. While the fractal magnification an inseparable feature of the fractal compression would be very useful in presenting the reconstructed image in a highly readable form. However, like all irreversible methods, the fractal compression is connected with the problem of information loss, which is especially troublesome in the medical imaging. A very time consuming encoding process, which can last even several hours, is another bothersome drawback of the fractal compression.Keywords: accelerated GPU, CUDA, parallel computing, fractal image compression
Procedia PDF Downloads 3364602 A Survey on Constraint Solving Approaches Using Parallel Architectures
Authors: Nebras Gharbi, Itebeddine Ghorbel
Abstract:
In the latest years and with the advancements of the multicore computing world, the constraint programming community tried to benefit from the capacity of new machines and make the best use of them through several parallel schemes for constraint solving. In this paper, we propose a survey of the different proposed approaches to solve Constraint Satisfaction Problems using parallel architectures. These approaches use in a different way a parallel architecture: the problem itself could be solved differently by several solvers or could be split over solvers.Keywords: constraint programming, parallel programming, constraint satisfaction problem, speed-up
Procedia PDF Downloads 3204601 Error Estimation for the Reconstruction Algorithm with Fan Beam Geometry
Authors: Nirmal Yadav, Tanuja Srivastava
Abstract:
Shannon theory is an exact method to recover a band limited signals from its sampled values in discrete implementation, using sinc interpolators. But sinc based results are not much satisfactory for band-limited calculations so that convolution with window function, having compact support, has been introduced. Convolution Backprojection algorithm with window function is an approximation algorithm. In this paper, the error has been calculated, arises due to this approximation nature of reconstruction algorithm. This result will be defined for fan beam projection data which is more faster than parallel beam projection.Keywords: computed tomography, convolution backprojection, radon transform, fan beam
Procedia PDF Downloads 4934600 Parallel Particle Swarm Optimization Optimized LDI Controller with Lyapunov Stability Criterion for Nonlinear Structural Systems
Authors: P. W. Tsai, W. L. Hong, C. W. Chen, C. Y. Chen
Abstract:
In this paper, we present a neural network (NN) based approach represent a nonlinear Tagagi-Sugeno (T-S) system. A linear differential inclusion (LDI) state-space representation is utilized to deal with the NN models. Taking advantage of the LDI representation, the stability conditions and controller design are derived for a class of nonlinear structural systems. Moreover, the concept of utilizing the Parallel Particle Swarm Optimization (PPSO) algorithm to solve the common P matrix under the stability criteria is given in this paper.Keywords: Lyapunov stability, parallel particle swarm optimization, linear differential inclusion, artificial intelligence
Procedia PDF Downloads 6564599 Parallel Computing: Offloading Matrix Multiplication to GPU
Authors: Bharath R., Tharun Sai N., Bhuvan G.
Abstract:
This project focuses on developing a Parallel Computing method aimed at optimizing matrix multiplication through GPU acceleration. Addressing algorithmic challenges, GPU programming intricacies, and integration issues, the project aims to enhance efficiency and scalability. The methodology involves algorithm design, GPU programming, and optimization techniques. Future plans include advanced optimizations, extended functionality, and integration with high-level frameworks. User engagement is emphasized through user-friendly interfaces, open- source collaboration, and continuous refinement based on feedback. The project's impact extends to significantly improving matrix multiplication performance in scientific computing and machine learning applications.Keywords: matrix multiplication, parallel processing, cuda, performance boost, neural networks
Procedia PDF Downloads 604598 Analysis of Tandem Detonator Algorithm Optimized by Quantum Algorithm
Authors: Tomasz Robert Kuczerski
Abstract:
The high complexity of the algorithm of the autonomous tandem detonator system creates an optimization problem due to the parallel operation of several machine states of the system. Many years of experience and classic analyses have led to a partially optimized model. Limitations on the energy resources of this class of autonomous systems make it necessary to search for more effective methods of optimisation. The use of the Quantum Approximate Optimization Algorithm (QAOA) in these studies shows the most promising results. With the help of multiple evaluations of several qubit quantum circuits, proper results of variable parameter optimization were obtained. In addition, it was observed that the increase in the number of assessments does not result in further efficient growth due to the increasing complexity of optimising variables. The tests confirmed the effectiveness of the QAOA optimization method.Keywords: algorithm analysis, autonomous system, quantum optimization, tandem detonator
Procedia PDF Downloads 924597 Implementation of CNV-CH Algorithm Using Map-Reduce Approach
Authors: Aishik Deb, Rituparna Sinha
Abstract:
We have developed an algorithm to detect the abnormal segment/"structural variation in the genome across a number of samples. We have worked on simulated as well as real data from the BAM Files and have designed a segmentation algorithm where abnormal segments are detected. This algorithm aims to improve the accuracy and performance of the existing CNV-CH algorithm. The next-generation sequencing (NGS) approach is very fast and can generate large sequences in a reasonable time. So the huge volume of sequence information gives rise to the need for Big Data and parallel approaches of segmentation. Therefore, we have designed a map-reduce approach for the existing CNV-CH algorithm where a large amount of sequence data can be segmented and structural variations in the human genome can be detected. We have compared the efficiency of the traditional and map-reduce algorithms with respect to precision, sensitivity, and F-Score. The advantages of using our algorithm are that it is fast and has better accuracy. This algorithm can be applied to detect structural variations within a genome, which in turn can be used to detect various genetic disorders such as cancer, etc. The defects may be caused by new mutations or changes to the DNA and generally result in abnormally high or low base coverage and quantification values.Keywords: cancer detection, convex hull segmentation, map reduce, next generation sequencing
Procedia PDF Downloads 1364596 A Parallel Implementation of Artificial Bee Colony Algorithm within CUDA Architecture
Authors: Selcuk Aslan, Dervis Karaboga, Celal Ozturk
Abstract:
Artificial Bee Colony (ABC) algorithm is one of the most successful swarm intelligence based metaheuristics. It has been applied to a number of constrained or unconstrained numerical and combinatorial optimization problems. In this paper, we presented a parallelized version of ABC algorithm by adapting employed and onlooker bee phases to the Compute Unified Device Architecture (CUDA) platform which is a graphical processing unit (GPU) programming environment by NVIDIA. The execution speed and obtained results of the proposed approach and sequential version of ABC algorithm are compared on functions that are typically used as benchmarks for optimization algorithms. Tests on standard benchmark functions with different colony size and number of parameters showed that proposed parallelization approach for ABC algorithm decreases the execution time consumed by the employed and onlooker bee phases in total and achieved similar or better quality of the results compared to the standard sequential implementation of the ABC algorithm.Keywords: Artificial Bee Colony algorithm, GPU computing, swarm intelligence, parallelization
Procedia PDF Downloads 3794595 Parallel Multisplitting Methods for Differential Systems
Authors: Malika El Kyal, Ahmed Machmoum
Abstract:
We prove the superlinear convergence of asynchronous multi-splitting methods applied to differential equations. This study is based on the technique of nested sets. It permits to specify kind of the convergence in the asynchronous mode.The main characteristic of an asynchronous mode is that the local algorithm not have to wait at predetermined messages to become available. We allow some processors to communicate more frequently than others, and we allow the communication delays to be substantial and unpredictable. Note that synchronous algorithms in the computer science sense are particular cases of our formulation of asynchronous one.Keywords: parallel methods, asynchronous mode, multisplitting, ODE
Procedia PDF Downloads 5264594 The Load Balancing Algorithm for the Star Interconnection Network
Authors: Ahmad M. Awwad, Jehad Al-Sadi
Abstract:
The star network is one of the promising interconnection networks for future high speed parallel computers, it is expected to be one of the future-generation networks. The star network is both edge and vertex symmetry, it was shown to have many gorgeous topological proprieties also it is owns hierarchical structure framework. Although much of the research work has been done on this promising network in literature, it still suffers from having enough algorithms for load balancing problem. In this paper we try to work on this issue by investigating and proposing an efficient algorithm for load balancing problem for the star network. The proposed algorithm is called Star Clustered Dimension Exchange Method SCDEM to be implemented on the star network. The proposed algorithm is based on the Clustered Dimension Exchange Method (CDEM). The SCDEM algorithm is shown to be efficient in redistributing the load balancing as evenly as possible among all nodes of different factor networks.Keywords: load balancing, star network, interconnection networks, algorithm
Procedia PDF Downloads 3204593 Discrete State Prediction Algorithm Design with Self Performance Enhancement Capacity
Authors: Smail Tigani, Mohamed Ouzzif
Abstract:
This work presents a discrete quantitative state prediction algorithm with intelligent behavior making it able to self-improve some performance aspects. The specificity of this algorithm is the capacity of self-rectification of the prediction strategy before the final decision. The auto-rectification mechanism is based on two parallel mathematical models. In one hand, the algorithm predicts the next state based on event transition matrix updated after each observation. In the other hand, the algorithm extracts its residues trend with a linear regression representing historical residues data-points in order to rectify the first decision if needs. For a normal distribution, the interactivity between the two models allows the algorithm to self-optimize its performance and then make better prediction. Designed key performance indicator, computed during a Monte Carlo simulation, shows the advantages of the proposed approach compared with traditional one.Keywords: discrete state, Markov Chains, linear regression, auto-adaptive systems, decision making, Monte Carlo Simulation
Procedia PDF Downloads 4984592 Optimizing Parallel Computing Systems: A Java-Based Approach to Modeling and Performance Analysis
Authors: Maher Ali Rusho, Sudipta Halder
Abstract:
The purpose of the study is to develop optimal solutions for models of parallel computing systems using the Java language. During the study, programmes were written for the examined models of parallel computing systems. The result of the parallel sorting code is the output of a sorted array of random numbers. When processing data in parallel, the time spent on processing and the first elements of the list of squared numbers are displayed. When processing requests asynchronously, processing completion messages are displayed for each task with a slight delay. The main results include the development of optimisation methods for algorithms and processes, such as the division of tasks into subtasks, the use of non-blocking algorithms, effective memory management, and load balancing, as well as the construction of diagrams and comparison of these methods by characteristics, including descriptions, implementation examples, and advantages. In addition, various specialised libraries were analysed to improve the performance and scalability of the models. The results of the work performed showed a substantial improvement in response time, bandwidth, and resource efficiency in parallel computing systems. Scalability and load analysis assessments were conducted, demonstrating how the system responds to an increase in data volume or the number of threads. Profiling tools were used to analyse performance in detail and identify bottlenecks in models, which improved the architecture and implementation of parallel computing systems. The obtained results emphasise the importance of choosing the right methods and tools for optimising parallel computing systems, which can substantially improve their performance and efficiency.Keywords: algorithm optimisation, memory management, load balancing, performance profiling, asynchronous programming.
Procedia PDF Downloads 144591 A Genetic Algorithm Based Permutation and Non-Permutation Scheduling Heuristics for Finite Capacity Material Requirement Planning Problem
Authors: Watchara Songserm, Teeradej Wuttipornpun
Abstract:
This paper presents a genetic algorithm based permutation and non-permutation scheduling heuristics (GAPNP) to solve a multi-stage finite capacity material requirement planning (FCMRP) problem in automotive assembly flow shop with unrelated parallel machines. In the algorithm, the sequences of orders are iteratively improved by the GA characteristics, whereas the required operations are scheduled based on the presented permutation and non-permutation heuristics. Finally, a linear programming is applied to minimize the total cost. The presented GAPNP algorithm is evaluated by using real datasets from automotive companies. The required parameters for GAPNP are intently tuned to obtain a common parameter setting for all case studies. The results show that GAPNP significantly outperforms the benchmark algorithm about 30% on average.Keywords: capacitated MRP, genetic algorithm, linear programming, automotive industries, flow shop, application in industry
Procedia PDF Downloads 4904590 A Design of Elliptic Curve Cryptography Processor based on SM2 over GF(p)
Authors: Shiji Hu, Lei Li, Wanting Zhou, DaoHong Yang
Abstract:
The data encryption, is the foundation of today’s communication. On this basis, how to improve the speed of data encryption and decryption is always a problem that scholars work for. In this paper, we proposed an elliptic curve crypto processor architecture based on SM2 prime field. In terms of hardware implementation, we optimized the algorithms in different stages of the structure. In finite field modulo operation, we proposed an optimized improvement of Karatsuba-Ofman multiplication algorithm, and shorten the critical path through pipeline structure in the algorithm implementation. Based on SM2 recommended prime field, a fast modular reduction algorithm is used to reduce 512-bit wide data obtained from the multiplication unit. The radix-4 extended Euclidean algorithm was used to realize the conversion between affine coordinate system and Jacobi projective coordinate system. In the parallel scheduling of point operations on elliptic curves, we proposed a three-level parallel structure of point addition and point double based on the Jacobian projective coordinate system. Combined with the scalar multiplication algorithm, we added mutual pre-operation to the point addition and double point operation to improve the efficiency of the scalar point multiplication. The proposed ECC hardware architecture was verified and implemented on Xilinx Virtex-7 and ZYNQ-7 platforms, and each 256-bit scalar multiplication operation took 0.275ms. The performance for handling scalar multiplication is 32 times that of CPU(dual-core ARM Cortex-A9).Keywords: Elliptic curve cryptosystems, SM2, modular multiplication, point multiplication.
Procedia PDF Downloads 1004589 Parallelization by Domain Decomposition for 1-D Sugarcane Equation with Message Passing Interface
Authors: Ewedafe Simon Uzezi
Abstract:
In this paper we presented a method based on Domain Decomposition (DD) for parallelization of 1-D Sugarcane Equation on parallel platform with parallel paradigms on Master-Slave platform using Message Passing Interface (MPI). The 1-D Sugarcane Equation was discretized using explicit method of discretization requiring evaluation nof temporal and spatial distribution of temperature. This platform gives better predictions of the effects of temperature distribution of the sugarcane problem. This work presented parallel overheads with overlapping communication and communication across parallel computers with numerical results across different block sizes with scalability. However, performance improvement strategies from the DD on various mesh sizes were compared experimentally and parallel results show speedup and efficiency for the parallel algorithms design.Keywords: sugarcane, parallelization, explicit method, domain decomposition, MPI
Procedia PDF Downloads 254588 Modal FDTD Method for Wave Propagation Modeling Customized for Parallel Computing
Authors: H. Samadiyeh, R. Khajavi
Abstract:
A new FD-based procedure, modal finite difference method (MFDM), is proposed for seismic wave propagation modeling, in which simulation is dealt with in the modal space. The method employs eigenvalues of a characteristic matrix formed by appropriate time-space FD stencils. Since MFD runs for different modes are totally independent of each other, MFDM can easily be parallelized while considerable simplicity in parallel-algorithm is also achieved. There is no requirement to any domain-decomposition procedure and inter-core data exchange. More important is the possibility to skip processing of less-significant modes, which enables one to adjust the procedure up to the level of accuracy needed. Thus, in addition to considerable ease of parallel programming, computation and storage costs are significantly reduced. The method is qualified for its efficiency by some numerical examples.Keywords: Finite Difference Method, Graphics Processing Unit (GPU), Message Passing Interface (MPI), Modal, Wave propagation
Procedia PDF Downloads 2974587 Adaptive Multipath Mitigation Acquisition Approach for Global Positioning System Software Receivers
Authors: Animut Meseret Simachew
Abstract:
Parallel Code Phase Search Acquisition (PCSA) Algorithm has been considered as a promising method in GPS software receivers for detection and estimation of the accurate correlation peak between the received Global Positioning System (GPS) signal and locally generated replicas. GPS signal acquisition in highly dense multipath environments is the main research challenge. In this work, we proposed a robust variable step-size (RVSS) PCSA algorithm based on fast frequency transform (FFT) filtering technique to mitigate short time delay multipath signals. Simulation results reveal the effectiveness of the proposed algorithm over the conventional PCSA algorithm. The proposed RVSS-PCSA algorithm equalizes the received carrier wiped-off signal with locally generated C/A code.Keywords: adaptive PCSA, detection and estimation, GPS signal acquisition, GPS software receiver
Procedia PDF Downloads 1174586 Performance Evaluation of Parallel Surface Modeling and Generation on Actual and Virtual Multicore Systems
Authors: Nyeng P. Gyang
Abstract:
Even though past, current and future trends suggest that multicore and cloud computing systems are increasingly prevalent/ubiquitous, this class of parallel systems is nonetheless underutilized, in general, and barely used for research on employing parallel Delaunay triangulation for parallel surface modeling and generation, in particular. The performances, of actual/physical and virtual/cloud multicore systems/machines, at executing various algorithms, which implement various parallelization strategies of the incremental insertion technique of the Delaunay triangulation algorithm, were evaluated. T-tests were run on the data collected, in order to determine whether various performance metrics differences (including execution time, speedup and efficiency) were statistically significant. Results show that the actual machine is approximately twice faster than the virtual machine at executing the same programs for the various parallelization strategies. Results, which furnish the scalability behaviors of the various parallelization strategies, also show that some of the differences between the performances of these systems, during different runs of the algorithms on the systems, were statistically significant. A few pseudo superlinear speedup results, which were computed from the raw data collected, are not true superlinear speedup values. These pseudo superlinear speedup values, which arise as a result of one way of computing speedups, disappear and give way to asymmetric speedups, which are the accurate kind of speedups that occur in the experiments performed.Keywords: cloud computing systems, multicore systems, parallel Delaunay triangulation, parallel surface modeling and generation
Procedia PDF Downloads 2064585 Performance Evaluation of Task Scheduling Algorithm on LCQ Network
Authors: Zaki Ahmad Khan, Jamshed Siddiqui, Abdus Samad
Abstract:
The Scheduling and mapping of tasks on a set of processors is considered as a critical problem in parallel and distributed computing system. This paper deals with the problem of dynamic scheduling on a special type of multiprocessor architecture known as Linear Crossed Cube (LCQ) network. This proposed multiprocessor is a hybrid network which combines the features of both linear type of architectures as well as cube based architectures. Two standard dynamic scheduling schemes namely Minimum Distance Scheduling (MDS) and Two Round Scheduling (TRS) schemes are implemented on the LCQ network. Parallel tasks are mapped and the imbalance of load is evaluated on different set of processors in LCQ network. The simulations results are evaluated and effort is made by means of through analysis of the results to obtain the best solution for the given network in term of load imbalance left and execution time. The other performance matrices like speedup and efficiency are also evaluated with the given dynamic algorithms.Keywords: dynamic algorithm, load imbalance, mapping, task scheduling
Procedia PDF Downloads 4514584 Parallel Multisplitting Methods for DAE’s
Authors: Ahmed Machmoum, Malika El Kyal
Abstract:
We consider iterative parallel multi-splitting method for differential algebraic equations. The main feature of the proposed idea is to use the asynchronous form. We prove that the multi-splitting technique can effectively accelerate the convergent performance of the iterative process. The main characteristic of an asynchronous mode is that the local algorithm not have to wait at predetermined messages to become available. We allow some processors to communicate more frequently than others, and we allow the communication delays tobe substantial and unpredictable. Note that synchronous algorithms in the computer science sense are particular cases of our formulation of asynchronous one.Keywords: computer, multi-splitting methods, asynchronous mode, differential algebraic systems
Procedia PDF Downloads 5494583 Dynamic Analysis of Offshore 2-HUS/U Parallel Platform
Authors: Xie Kefeng, Zhang He
Abstract:
For the stability and control demand of offshore small floating platform, a 2-HUS/U parallel mechanism was presented as offshore platform. Inverse kinematics was obtained by institutional constraint equation, and the dynamic model of offshore 2-HUS/U parallel platform was derived based on rigid body’s Lagrangian method. The equivalent moment of inertia, damping and driving force/torque variation of offshore 2-HUS/U parallel platform were analyzed. A numerical example shows that, for parallel platform of given motion, system’s equivalent inertia changes 1.25 times maximally. During the movement of platform, they change dramatically with the system configuration and have coupling characteristics. The maximum equivalent drive torque is 800 N. At the same time, the curve of platform’s driving force/torque is smooth and has good sine features. The control system needs to be adjusted according to kinetic equation during stability and control and it provides a basis for the optimization of control system.Keywords: 2-HUS/U platform, dynamics, Lagrange, parallel platform
Procedia PDF Downloads 3454582 Designing a Robust Controller for a 6 Linkage Robot
Authors: G. Khamooshian
Abstract:
One of the main points of application of the mechanisms of the series and parallel is the subject of managing them. The control of this mechanism and similar mechanisms is one that has always been the intention of the scholars. On the other hand, modeling the behavior of the system is difficult due to the large number of its parameters, and it leads to complex equations that are difficult to solve and eventually difficult to control. In this paper, a six-linkage robot has been presented that could be used in different areas such as medical robots. Using these robots needs a robust control. In this paper, the system equations are first found, and then the system conversion function is written. A new controller has been designed for this robot which could be used in other parallel robots and could be very useful. Parallel robots are so important in robotics because of their stability, so methods for control of them are important and the robust controller, especially in parallel robots, makes a sense.Keywords: 3-RRS, 6 linkage, parallel robot, control
Procedia PDF Downloads 1594581 Parallel Gripper Modelling and Design Optimization Using Multi-Objective Grey Wolf Optimizer
Authors: Golak Bihari Mahanta, Bibhuti Bhusan Biswal, B. B. V. L. Deepak, Amruta Rout, Gunji Balamurali
Abstract:
Robots are widely used in the manufacturing industry for rapid production with higher accuracy and precision. With the help of End-of-Arm Tools (EOATs), robots are interacting with the environment. Robotic grippers are such EOATs which help to grasp the object in an automation system for improving the efficiency. As the robotic gripper directly influence the quality of the product due to the contact between the gripper surface and the object to be grasped, it is necessary to design and optimize the gripper mechanism configuration. In this study, geometric and kinematic modeling of the parallel gripper is proposed. Grey wolf optimizer algorithm is introduced for solving the proposed multiobjective gripper optimization problem. Two objective functions developed from the geometric and kinematic modeling along with several nonlinear constraints of the proposed gripper mechanism is used to optimize the design variables of the systems. Finally, the proposed methodology compared with a previously proposed method such as Teaching Learning Based Optimization (TLBO) algorithm, NSGA II, MODE and it was seen that the proposed method is more efficient compared to the earlier proposed methodology.Keywords: gripper optimization, metaheuristics, , teaching learning based algorithm, multi-objective optimization, optimal gripper design
Procedia PDF Downloads 1884580 Parallel Querying of Distributed Ontologies with Shared Vocabulary
Authors: Sharjeel Aslam, Vassil Vassilev, Karim Ouazzane
Abstract:
Ontologies and various semantic repositories became a convenient approach for implementing model-driven architectures of distributed systems on the Web. SPARQL is the standard query language for querying such. However, although SPARQL is well-established standard for querying semantic repositories in RDF and OWL format and there are commonly used APIs which supports it, like Jena for Java, its parallel option is not incorporated in them. This article presents a complete framework consisting of an object algebra for parallel RDF and an index-based implementation of the parallel query engine capable of dealing with the distributed RDF ontologies which share common vocabulary. It has been implemented in Java, and for validation of the algorithms has been applied to the problem of organizing virtual exhibitions on the Web.Keywords: distributed ontologies, parallel querying, semantic indexing, shared vocabulary, SPARQL
Procedia PDF Downloads 2054579 Co-Evolutionary Fruit Fly Optimization Algorithm and Firefly Algorithm for Solving Unconstrained Optimization Problems
Authors: R. M. Rizk-Allah
Abstract:
This paper presents co-evolutionary fruit fly optimization algorithm based on firefly algorithm (CFOA-FA) for solving unconstrained optimization problems. The proposed algorithm integrates the merits of fruit fly optimization algorithm (FOA), firefly algorithm (FA) and elite strategy to refine the performance of classical FOA. Moreover, co-evolutionary mechanism is performed by applying FA procedures to ensure the diversity of the swarm. Finally, the proposed algorithm CFOA- FA is tested on several benchmark problems from the usual literature and the numerical results have demonstrated the superiority of the proposed algorithm for finding the global optimal solution.Keywords: firefly algorithm, fruit fly optimization algorithm, unconstrained optimization problems
Procedia PDF Downloads 536