Search results for: Parallel Processing.
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 2054

Search results for: Parallel Processing.

1994 Parallel Alternating Two-stage Methods for Solving Linear System

Authors: Guangbin Wang, Ning Zhang, Fuping Tan

Abstract:

In this paper, we present parallel alternating two-stage methods for solving linear system Ax = b, where A is a monotone matrix or an H-matrix. And we give some convergence results of these methods for nonsingular linear system.

Keywords: Parallel, alternating two-stage, convergence, linear system.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1107
1993 Parallel Block Backward Differentiation Formulas for Solving Ordinary Differential Equations

Authors: Khairil Iskandar Othman, Zarina Bibi Ibrahim, Mohamed Suleiman

Abstract:

A parallel block method based on Backward Differentiation Formulas (BDF) is developed for the parallel solution of stiff Ordinary Differential Equations (ODEs). Most common methods for solving stiff systems of ODEs are based on implicit formulae and solved using Newton iteration which requires repeated solution of systems of linear equations with coefficient matrix, I - hβJ . Here, J is the Jacobian matrix of the problem. In this paper, the matrix operations is paralleled in order to reduce the cost of the iterations. Numerical results are given to compare the speedup and efficiency of parallel algorithm and that of sequential algorithm.

Keywords: Backward Differentiation Formula, block, ordinarydifferential equations.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1956
1992 Spatial Audio Player Using Musical Genre Classification

Authors: Jun-Yong Lee, Hyoung-Gook Kim

Abstract:

In this paper, we propose a smart music player that combines the musical genre classification and the spatial audio processing. The musical genre is classified based on content analysis of the musical segment detected from the audio stream. In parallel with the classification, the spatial audio quality is achieved by adding an artificial reverberation in a virtual acoustic space to the input mono sound. Thereafter, the spatial sound is boosted with the given frequency gains based on the musical genre when played back. Experiments measured the accuracy of detecting the musical segment from the audio stream and its musical genre classification. A listening test was performed based on the virtual acoustic space based spatial audio processing.

Keywords: Automatic equalization, genre classification, music segment detection, spatial audio processing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1587
1991 Some Preconditioners for Block Pentadiagonal Linear Systems Based on New Approximate Factorization Methods

Authors: Xian Ming Gu, Ting Zhu Huang, Hou Biao Li

Abstract:

In this paper, getting an high-efficiency parallel algorithm to solve sparse block pentadiagonal linear systems suitable for vectors and parallel processors, stair matrices are used to construct some parallel polynomial approximate inverse preconditioners. These preconditioners are appropriate when the desired target is to maximize parallelism. Moreover, some theoretical results about these preconditioners are presented and how to construct preconditioners effectively for any nonsingular block pentadiagonal H-matrices is also described. In addition, the availability of these preconditioners is illustrated with some numerical experiments arising from two dimensional biharmonic equation.

Keywords: Parallel algorithm, Pentadiagonal matrix, Polynomial approximate inverse, Preconditioners, Stair matrix.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2196
1990 Study of Temperature Difference and Current Distribution in Parallel-Connected Cells at Low Temperature

Authors: Sara Kamalisiahroudi, Jun Huang, Zhe Li, Jianbo Zhang

Abstract:

Two types of commercial cylindrical lithium ion batteries (Panasonic 3.4 Ah NCR-18650B and Samsung 2.9 Ah INR-18650), were investigated experimentally. The capacities of these samples were individually measured using constant current-constant voltage (CC-CV) method at different ambient temperatures (-10°C, 0°C, 25°C). Their internal resistance was determined by electrochemical impedance spectroscopy (EIS) and pulse discharge methods. The cells with different configurations of parallel connection NCR-NCR, INR-INR and NCR-INR were charged/discharged at the aforementioned ambient temperatures. The results showed that the difference of internal resistance between cells much more evident at low temperatures. Furthermore, the parallel connection of NCR-NCR exhibits the most uniform temperature distribution in cells at -10°C, this feature is quite favorable for the safety of the battery pack.

Keywords: Batteries in parallel connection, internal resistance, low temperature, temperature difference, current distribution.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3165
1989 Application and Limitation of Parallel Modelingin Multidimensional Sequential Pattern

Authors: Mahdi Esmaeili, Mansour Tarafdar

Abstract:

The goal of data mining algorithms is to discover useful information embedded in large databases. One of the most important data mining problems is discovery of frequently occurring patterns in sequential data. In a multidimensional sequence each event depends on more than one dimension. The search space is quite large and the serial algorithms are not scalable for very large datasets. To address this, it is necessary to study scalable parallel implementations of sequence mining algorithms. In this paper, we present a model for multidimensional sequence and describe a parallel algorithm based on data parallelism. Simulation experiments show good load balancing and scalable and acceptable speedup over different processors and problem sizes and demonstrate that our approach can works efficiently in a real parallel computing environment.

Keywords: Sequential Patterns, Data Mining, ParallelAlgorithm, Multidimensional Sequence Data

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1440
1988 A High Performance MPI for Parallel and Distributed Computing

Authors: Prabu D., Vanamala V., Sanjeeb Kumar Deka, Sridharan R., Prahlada Rao B. B., Mohanram N.

Abstract:

Message Passing Interface is widely used for Parallel and Distributed Computing. MPICH and LAM are popular open source MPIs available to the parallel computing community also there are commercial MPIs, which performs better than MPICH etc. In this paper, we discuss a commercial Message Passing Interface, CMPI (C-DAC Message Passing Interface). C-MPI is an optimized MPI for CLUMPS. It is found to be faster and more robust compared to MPICH. We have compared performance of C-MPI and MPICH on Gigabit Ethernet network.

Keywords: C-MPI, C-VIA, HPC, MPICH, P-COMS, PMB

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1512
1987 Solving Facility Location Problem on Cluster Computing

Authors: Ei Phyo Wai, Nay Min Tun

Abstract:

Computation of facility location problem for every location in the country is not easy simultaneously. Solving the problem is described by using cluster computing. A technique is to design parallel algorithm by using local search with single swap method in order to solve that problem on clusters. Parallel implementation is done by the use of portable parallel programming, Message Passing Interface (MPI), on Microsoft Windows Compute Cluster. In this paper, it presents the algorithm that used local search with single swap method and implementation of the system of a facility to be opened by using MPI on cluster. If large datasets are considered, the process of calculating a reasonable cost for a facility becomes time consuming. The result shows parallel computation of facility location problem on cluster speedups and scales well as problem size increases.

Keywords: cluster, cost, demand, facility location

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1443
1986 New VLSI Architecture for Motion Estimation Algorithm

Authors: V. S. K. Reddy, S. Sengupta, Y. M. Latha

Abstract:

This paper presents an efficient VLSI architecture design to achieve real time video processing using Full-Search Block Matching (FSBM) algorithm. The design employs parallel bank architecture with minimum latency, maximum throughput, and full hardware utilization. We use nine parallel processors in our architecture and each controlled by a state machine. State machine control implementation makes the design very simple and cost effective. The design is implemented using VHDL and the programming techniques we incorporated makes the design completely programmable in the sense that the search ranges and the block sizes can be varied to suit any given requirements. The design can operate at frequencies up to 36 MHz and it can function in QCIF and CIF video resolution at 1.46 MHz and 5.86 MHz, respectively.

Keywords: Video Coding, Motion Estimation, Full-Search, Block-Matching, VLSI Architecture.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1767
1985 Security over OFDM Fading Channels with Friendly Jammer

Authors: Munnujahan Ara

Abstract:

In this paper, we investigate the effect of friendly jamming power allocation strategies on the achievable average secrecy rate over a bank of parallel fading wiretap channels. We investigate the achievable average secrecy rate in parallel fading wiretap channels subject to Rayleigh and Rician fading. The achievable average secrecy rate, due to the presence of a line-of-sight component in the jammer channel is also evaluated. Moreover, we study the detrimental effect of correlation across the parallel sub-channels, and evaluate the corresponding decrease in the achievable average secrecy rate for the various fading configurations. We also investigate the tradeoff between the transmission power and the jamming power for a fixed total power budget. Our results, which are applicable to current orthogonal frequency division multiplexing (OFDM) communications systems, shed further light on the achievable average secrecy rates over a bank of parallel fading channels in the presence of friendly jammers.

Keywords: Fading parallel channels, Wire-tap channel, OFDM, Secrecy capacity, Power allocation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2210
1984 Strip Decomposition Parallelization of Fast Direct Poisson Solver on a 3D Cartesian Staggered Grid

Authors: Minh Vuong Pham, Frédéric Plourde, Son Doan Kim

Abstract:

A strip domain decomposition parallel algorithm for fast direct Poisson solver is presented on a 3D Cartesian staggered grid. The parallel algorithm follows the principles of sequential algorithm for fast direct Poisson solver. Both Dirichlet and Neumann boundary conditions are addressed. Several test cases are likewise addressed in order to shed light on accuracy and efficiency in the strip domain parallelization algorithm. Actually the current implementation shows a very high efficiency when dealing with a large grid mesh up to 3.6 * 109 under massive parallel approach, which explicitly demonstrates that the proposed algorithm is ready for massive parallel computing.

Keywords: Strip-decomposition, parallelization, fast directpoisson solver.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2008
1983 Parallelization of Ensemble Kalman Filter (EnKF) for Oil Reservoirs with Time-lapse Seismic Data

Authors: Md Khairullah, Hai-Xiang Lin, Remus G. Hanea, Arnold W. Heemink

Abstract:

In this paper we describe the design and implementation of a parallel algorithm for data assimilation with ensemble Kalman filter (EnKF) for oil reservoir history matching problem. The use of large number of observations from time-lapse seismic leads to a large turnaround time for the analysis step, in addition to the time consuming simulations of the realizations. For efficient parallelization it is important to consider parallel computation at the analysis step. Our experiments show that parallelization of the analysis step in addition to the forecast step has good scalability, exploiting the same set of resources with some additional efforts.

Keywords: EnKF, Data assimilation, Parallel computing, Parallel efficiency.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2226
1982 Efficient Implementation of Serial and Parallel Support Vector Machine Training with a Multi-Parameter Kernel for Large-Scale Data Mining

Authors: Tatjana Eitrich, Bruno Lang

Abstract:

This work deals with aspects of support vector learning for large-scale data mining tasks. Based on a decomposition algorithm that can be run in serial and parallel mode we introduce a data transformation that allows for the usage of an expensive generalized kernel without additional costs. In order to speed up the decomposition algorithm we analyze the problem of working set selection for large data sets and analyze the influence of the working set sizes onto the scalability of the parallel decomposition scheme. Our modifications and settings lead to improvement of support vector learning performance and thus allow using extensive parameter search methods to optimize classification accuracy.

Keywords: Support Vector Machines, Shared Memory Parallel Computing, Large Data

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1540
1981 Forward Kinematics Analysis of a 3-PRS Parallel Manipulator

Authors: Ghasem Abbasnejad, Soheil Zarkandi, Misagh Imani

Abstract:

In this article the homotopy continuation method (HCM) to solve the forward kinematic problem of the 3-PRS parallel manipulator is used. Since there are many difficulties in solving the system of nonlinear equations in kinematics of manipulators, the numerical solutions like Newton-Raphson are inevitably used. When dealing with any numerical solution, there are two troublesome problems. One is that good initial guesses are not easy to detect and another is related to whether the used method will converge to useful solutions. Results of this paper reveal that the homotopy continuation method can alleviate the drawbacks of traditional numerical techniques.

Keywords: Forward kinematics, Homotopy continuationmethod, Parallel manipulators, Rotation matrix

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2610
1980 Fast Painting with Different Colors Using Cross Correlation in the Frequency Domain

Authors: Hazem M. El-Bakry

Abstract:

In this paper, a new technique for fast painting with different colors is presented. The idea of painting relies on applying masks with different colors to the background. Fast painting is achieved by applying these masks in the frequency domain instead of spatial (time) domain. New colors can be generated automatically as a result from the cross correlation operation. This idea was applied successfully for faster specific data (face, object, pattern, and code) detection using neural algorithms. Here, instead of performing cross correlation between the input input data (e.g., image, or a stream of sequential data) and the weights of neural networks, the cross correlation is performed between the colored masks and the background. Furthermore, this approach is developed to reduce the computation steps required by the painting operation. The principle of divide and conquer strategy is applied through background decomposition. Each background is divided into small in size subbackgrounds and then each sub-background is processed separately by using a single faster painting algorithm. Moreover, the fastest painting is achieved by using parallel processing techniques to paint the resulting sub-backgrounds using the same number of faster painting algorithms. In contrast to using only faster painting algorithm, the speed up ratio is increased with the size of the background when using faster painting algorithm and background decomposition. Simulation results show that painting in the frequency domain is faster than that in the spatial domain.

Keywords: Fast Painting, Cross Correlation, Frequency Domain, Parallel Processing

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1742
1979 Massively-Parallel Bit-Serial Neural Networks for Fast Epilepsy Diagnosis: A Feasibility Study

Authors: Si Mon Kueh, Tom J. Kazmierski

Abstract:

There are about 1% of the world population suffering from the hidden disability known as epilepsy and major developing countries are not fully equipped to counter this problem. In order to reduce the inconvenience and danger of epilepsy, different methods have been researched by using a artificial neural network (ANN) classification to distinguish epileptic waveforms from normal brain waveforms. This paper outlines the aim of achieving massive ANN parallelization through a dedicated hardware using bit-serial processing. The design of this bit-serial Neural Processing Element (NPE) is presented which implements the functionality of a complete neuron using variable accuracy. The proposed design has been tested taking into consideration non-idealities of a hardware ANN. The NPE consists of a bit-serial multiplier which uses only 16 logic elements on an Altera Cyclone IV FPGA and a bit-serial ALU as well as a look-up table. Arrays of NPEs can be driven by a single controller which executes the neural processing algorithm. In conclusion, the proposed compact NPE design allows the construction of complex hardware ANNs that can be implemented in a portable equipment that suits the needs of a single epileptic patient in his or her daily activities to predict the occurrences of impending tonic conic seizures.

Keywords: Artificial Neural Networks, bit-serial neural processor, FPGA, Neural Processing Element.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1528
1978 Development of Circulating Support Environment of Multilingual Medical Communication using Parallel Texts for Foreign Patients

Authors: Mai Miyabe, Taku Fukushima, Takashi Yoshino, Aguri Shigeno

Abstract:

The need for multilingual communication in Japan has increased due to an increase in the number of foreigners in the country. When people communicate in their nonnative language, the differences in language prevent mutual understanding among the communicating individuals. In the medical field, communication between the hospital staff and patients is a serious problem. Currently, medical translators accompany patients to medical care facilities, and the demand for medical translators is increasing. However, medical translators cannot necessarily provide support, especially in cases in which round-the-clock support is required or in case of emergencies. The medical field has high expectations from information technology. Hence, a system that supports accurate multilingual communication is required. Despite recent advances in machine translation technology, it is very difficult to obtain highly accurate translations. We have developed a support system called M3 for multilingual medical reception. M3 provides support functions that aid foreign patients in the following respects: conversation, questionnaires, reception procedures, and hospital navigation; it also has a Q&A function. Users can operate M3 using a touch screen and receive text-based support. In addition, M3 uses accurate translation tools called parallel texts to facilitate reliable communication through conversations between the hospital staff and the patients. However, if there is no parallel text that expresses what users want to communicate, the users cannot communicate. In this study, we have developed a circulating support environment for multilingual medical communication using parallel texts. The proposed environment can circulate necessary parallel texts through the following procedure: (1) a user provides feedback about the necessary parallel texts, following which (2) these parallel texts are created and evaluated.

Keywords: multilingual medical communication, parallel texts.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1441
1977 GPU-Accelerated Triangle Mesh Simplification Using Parallel Vertex Removal

Authors: Thomas Odaker, Dieter Kranzlmueller, Jens Volkert

Abstract:

We present an approach to triangle mesh simplification designed to be executed on the GPU. We use a quadric error metric to calculate an error value for each vertex of the mesh and order all vertices based on this value. This step is followed by the parallel removal of a number of vertices with the lowest calculated error values. To allow for the parallel removal of multiple vertices we use a set of per-vertex boundaries that prevent mesh foldovers even when simplification operations are performed on neighbouring vertices. We execute multiple iterations of the calculation of the vertex errors, ordering of the error values and removal of vertices until either a desired number of vertices remains in the mesh or a minimum error value is reached. This parallel approach is used to speed up the simplification process while maintaining mesh topology and avoiding foldovers at every step of the simplification.

Keywords: Computer graphics, half edge collapse, mesh simplification, precomputed simplification, topology preserving.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2742
1976 Motion Protection System Design for a Parallel Motion Platform

Authors: Dongsu Wu, Hongbin Gu

Abstract:

A motion protection system is designed for a parallel motion platform with subsided cabin. Due to its complex structure, parallel mechanism is easy to encounter interference problems including link length limits, joints limits and self-collision. Thus a virtual spring algorithm in operational space is developed for the motion protection system to avoid potential damages caused by interference. Simulation results show that the proposed motion protection system can effectively eliminate interference problems and ensure safety of the whole motion platform.

Keywords: Motion protection, motion platform, parallelmechanism, Stewart platform, collision avoidance.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1514
1975 Fast Database Indexing for Large Protein Sequence Collections Using Parallel N-Gram Transformation Algorithm

Authors: Jehad A. H. Hammad, Nur'Aini binti Abdul Rashid

Abstract:

With the rapid development in the field of life sciences and the flooding of genomic information, the need for faster and scalable searching methods has become urgent. One of the approaches that were investigated is indexing. The indexing methods have been categorized into three categories which are the lengthbased index algorithms, transformation-based algorithms and mixed techniques-based algorithms. In this research, we focused on the transformation based methods. We embedded the N-gram method into the transformation-based method to build an inverted index table. We then applied the parallel methods to speed up the index building time and to reduce the overall retrieval time when querying the genomic database. Our experiments show that the use of N-Gram transformation algorithm is an economical solution; it saves time and space too. The result shows that the size of the index is smaller than the size of the dataset when the size of N-Gram is 5 and 6. The parallel N-Gram transformation algorithm-s results indicate that the uses of parallel programming with large dataset are promising which can be improved further.

Keywords: Biological sequence, Database index, N-gram indexing, Parallel computing, Sequence retrieval.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2081
1974 Detecting the Edge of Multiple Images in Parallel

Authors: Prakash K. Aithal, U. Dinesh Acharya, Rajesh Gopakumar

Abstract:

Edge is variation of brightness in an image. Edge detection is useful in many application areas such as finding forests, rivers from a satellite image, detecting broken bone in a medical image etc. The paper discusses about finding edge of multiple aerial images in parallel. The proposed work tested on 38 images 37 colored and one monochrome image. The time taken to process N images in parallel is equivalent to time taken to process 1 image in sequential. Message Passing Interface (MPI) and Open Computing Language (OpenCL) is used to achieve task and pixel level parallelism respectively.

Keywords: Edge detection, multicore, GPU, openCL, MPI.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2302
1973 Singularity Loci of Actuation Schemes for 3RRR Planar Parallel Manipulator

Authors: S. Ramana Babu, V. Ramachandra Raju, K. Ramji

Abstract:

This paper presents the effect of actuation schemes on the performance of parallel manipulators and also how the singularity loci have been changed in the reachable workspace of the manipulator with the choice of actuation scheme to drive the manipulator. The performance of the eight possible actuation schemes of 3RRR planar parallel manipulator is compared with each other. The optimal design problem is formulated to find the manipulator geometry that maximizes the singularity free conditioned workspace for all the eight actuation cases, the optimization problem is solved by using genetic algorithms.

Keywords: Actuation schemes, GCI, genetic algorithms.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1562
1972 Unrelated Parallel Machines Scheduling Problem Using an Ant Colony Optimization Approach

Authors: Y. K. Lin, H. T. Hsieh, F. Y. Hsieh

Abstract:

Total weighted tardiness is a measure of customer satisfaction. Minimizing it represents satisfying the general requirement of on-time delivery. In this research, we consider an ant colony optimization (ACO) algorithm to solve the problem of scheduling unrelated parallel machines to minimize total weighted tardiness. The problem is NP-hard in the strong sense. Computational results show that the proposed ACO algorithm is giving promising results compared to other existing algorithms.

Keywords: ant colony optimization, total weighted tardiness, unrelated parallel machines.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1857
1971 Sub-Image Detection Using Fast Neural Processors and Image Decomposition

Authors: Hazem M. El-Bakry, Qiangfu Zhao

Abstract:

In this paper, an approach to reduce the computation steps required by fast neural networksfor the searching process is presented. The principle ofdivide and conquer strategy is applied through imagedecomposition. Each image is divided into small in sizesub-images and then each one is tested separately usinga fast neural network. The operation of fast neuralnetworks based on applying cross correlation in thefrequency domain between the input image and theweights of the hidden neurons. Compared toconventional and fast neural networks, experimentalresults show that a speed up ratio is achieved whenapplying this technique to locate human facesautomatically in cluttered scenes. Furthermore, fasterface detection is obtained by using parallel processingtechniques to test the resulting sub-images at the sametime using the same number of fast neural networks. Incontrast to using only fast neural networks, the speed upratio is increased with the size of the input image whenusing fast neural networks and image decomposition.

Keywords: Fast Neural Networks, 2D-FFT, CrossCorrelation, Image decomposition, Parallel Processing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2127
1970 Bounds on Reliability of Parallel Computer Interconnection Systems

Authors: Ranjan Kumar Dash, Chita Ranjan Tripathy

Abstract:

The evaluation of residual reliability of large sized parallel computer interconnection systems is not practicable with the existing methods. Under such conditions, one must go for approximation techniques which provide the upper bound and lower bound on this reliability. In this context, a new approximation method for providing bounds on residual reliability is proposed here. The proposed method is well supported by two algorithms for simulation purpose. The bounds on residual reliability of three different categories of interconnection topologies are efficiently found by using the proposed method

Keywords: Parallel computer network, reliability, probabilisticgraph, interconnection networks.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1299
1969 Minimizing Makespan Subject to Budget Limitation in Parallel Flow Shop

Authors: Amin Sahraeian

Abstract:

One of the criteria in production scheduling is Make Span, minimizing this criteria causes more efficiently use of the resources specially machinery and manpower. By assigning some budget to some of the operations the operation time of these activities reduces and affects the total completion time of all the operations (Make Span). In this paper this issue is practiced in parallel flow shops. At first we convert parallel flow shop to a network model and by using a linear programming approach it is identified in order to minimize make span (the completion time of the network) which activities (operations) are better to absorb the predetermined and limited budget. Minimizing the total completion time of all the activities in the network is equivalent to minimizing make span in production scheduling.

Keywords: parallel flow shop, make span, linear programming, budget

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1741
1968 High Level Synthesis of Kahn Process Networks(KPN) for Streaming Applications

Authors: Attiya Mahmood, Syed Ali Abbas, Shoab A. Khan

Abstract:

Streaming Applications usually run in parallel or in series that incrementally transform a stream of input data. It poses a design challenge to break such an application into distinguishable blocks and then to map them into independent hardware processing elements. For this, there is required a generic controller that automatically maps such a stream of data into independent processing elements without any dependencies and manual considerations. In this paper, Kahn Process Networks (KPN) for such streaming applications is designed and developed that will be mapped on MPSoC. This is designed in such a way that there is a generic Cbased compiler that will take the mapping specifications as an input from the user and then it will automate these design constraints and automatically generate the synthesized RTL optimized code for specified application.

Keywords: KPN, DFG, FPGA

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1780
1967 Novel Ridge Orientation Based Approach for Fingerprint Identification Using Co-Occurrence Matrix

Authors: Mehran Yazdi, Zahra Adelpour, Batoul Bahraini, Yasaman Keshtkar Jahromi

Abstract:

In this paper we use the property of co-occurrence matrix in finding parallel lines in binary pictures for fingerprint identification. In our proposed algorithm, we reduce the noise by filtering the fingerprint images and then transfer the fingerprint images to binary images using a proper threshold. Next, we divide the binary images into some regions having parallel lines in the same direction. The lines in each region have a specific angle that can be used for comparison. This method is simple, performs the comparison step quickly and has a good resistance in the presence of the noise.

Keywords: Parallel lines detection, co-occurrence matrix, fingerprint identification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1318
1966 A New High Speed Neural Model for Fast Character Recognition Using Cross Correlation and Matrix Decomposition

Authors: Hazem M. El-Bakry

Abstract:

Neural processors have shown good results for detecting a certain character in a given input matrix. In this paper, a new idead to speed up the operation of neural processors for character detection is presented. Such processors are designed based on cross correlation in the frequency domain between the input matrix and the weights of neural networks. This approach is developed to reduce the computation steps required by these faster neural networks for the searching process. The principle of divide and conquer strategy is applied through image decomposition. Each image is divided into small in size sub-images and then each one is tested separately by using a single faster neural processor. Furthermore, faster character detection is obtained by using parallel processing techniques to test the resulting sub-images at the same time using the same number of faster neural networks. In contrast to using only faster neural processors, the speed up ratio is increased with the size of the input image when using faster neural processors and image decomposition. Moreover, the problem of local subimage normalization in the frequency domain is solved. The effect of image normalization on the speed up ratio of character detection is discussed. Simulation results show that local subimage normalization through weight normalization is faster than subimage normalization in the spatial domain. The overall speed up ratio of the detection process is increased as the normalization of weights is done off line.

Keywords: Fast Character Detection, Neural Processors, Cross Correlation, Image Normalization, Parallel Processing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1492
1965 Analyzing the Factors that Cause Parallel Performance Degradation in Parallel Graph-Based Computations Using Graph500

Authors: Mustafa Elfituri, Jonathan Cook

Abstract:

Recently, graph-based computations have become more important in large-scale scientific computing as they can provide a methodology to model many types of relations between independent objects. They are being actively used in fields as varied as biology, social networks, cybersecurity, and computer networks. At the same time, graph problems have some properties such as irregularity and poor locality that make their performance different than regular applications performance. Therefore, parallelizing graph algorithms is a hard and challenging task. Initial evidence is that standard computer architectures do not perform very well on graph algorithms. Little is known exactly what causes this. The Graph500 benchmark is a representative application for parallel graph-based computations, which have highly irregular data access and are driven more by traversing connected data than by computation. In this paper, we present results from analyzing the performance of various example implementations of Graph500, including a shared memory (OpenMP) version, a distributed (MPI) version, and a hybrid version. We measured and analyzed all the factors that affect its performance in order to identify possible changes that would improve its performance. Results are discussed in relation to what factors contribute to performance degradation.

Keywords: Graph computation, Graph500 benchmark, parallel architectures, parallel programming, workload characterization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 477