Search results for: Parallel Pipeline Router Architecture
1536 Pipelined Control-Path Effects on Area and Performance of a Wormhole-Switched Network-on-Chip
Authors: Faizal A. Samman, Thomas Hollstein, Manfred Glesner
Abstract:
This paper presents design trade-off and performance impacts of the amount of pipeline phase of control path signals in a wormhole-switched network-on-chip (NoC). The numbers of the pipeline phase of the control path vary between two- and one-cycle pipeline phase. The control paths consist of the routing request paths for output selection and the arbitration paths for input selection. Data communications between on-chip routers are implemented synchronously and for quality of service, the inter-router data transports are controlled by using a link-level congestion control to avoid lose of data because of an overflow. The trade-off between the area (logic cell area) and the performance (bandwidth gain) of two proposed NoC router microarchitectures are presented in this paper. The performance evaluation is made by using a traffic scenario with different number of workloads under 2D mesh NoC topology using a static routing algorithm. By using a 130-nm CMOS standard-cell technology, our NoC routers can be clocked at 1 GHz, resulting in a high speed network link and high router bandwidth capacity of about 320 Gbit/s. Based on our experiments, the amount of control path pipeline stages gives more significant impact on the NoC performance than the impact on the logic area of the NoC router.Keywords: Network-on-Chip, Synchronous Parallel Pipeline, Router Architecture, Wormhole Switching
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14831535 FPGA Hardware Implementation and Evaluation of a Micro-Network Architecture for Multi-Core Systems
Authors: Yahia Salah, Med Lassaad Kaddachi, Rached Tourki
Abstract:
This paper presents the design, implementation and evaluation of a micro-network, or Network-on-Chip (NoC), based on a generic pipeline router architecture. The router is designed to efficiently support traffic generated by multimedia applications on embedded multi-core systems. It employs a simplest routing mechanism and implements the round-robin scheduling strategy to resolve output port contentions and minimize latency. A virtual channel flow control is applied to avoid the head-of-line blocking problem and enhance performance in the NoC. The hardware design of the router architecture has been implemented at the register transfer level; its functionality is evaluated in the case of the two dimensional Mesh/Torus topology, and performance results are derived from ModelSim simulator and Xilinx ISE 9.2i synthesis tool. An example of a multi-core image processing system utilizing the NoC structure has been implemented and validated to demonstrate the capability of the proposed micro-network architecture. To reduce complexity of the image compression and decompression architecture, the system use image processing algorithm based on classical discrete cosine transform with an efficient zonal processing approach. The experimental results have confirmed that both the proposed image compression scheme and NoC architecture can achieve a reasonable image quality with lower processing time.
Keywords: Generic Pipeline Network-on-Chip Router Architecture, JPEG Image Compression, FPGA Hardware Implementation, Performance Evaluation.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 30971534 A Generic and Extensible Spidergon NoC
Authors: Abdelkrim Zitouni, Mounir Zid, Sami Badrouchi, Rached Tourki
Abstract:
The Globally Asynchronous Locally Synchronous Network on Chip (GALS NoC) is the most efficient solution that provides low latency transfers and power efficient System on Chip (SoC) interconnect. This study presents a GALS and generic NoC architecture based on a configurable router. This router integrates a sophisticated dynamic arbiter, the wormhole routing technique and can be configured in a manner that allows it to be used in many possible NoC topologies such as Mesh 2-D, Tree and Polygon architectures. This makes it possible to improve the quality of service (QoS) required by the proposed NoC. A comparative performances study of the proposed NoC architecture, Tore architecture and of the most used Mesh 2D architecture is performed. This study shows that Spidergon architecture is characterised by the lower latency and the later saturation. It is also shown that no matter what the number of used links is raised; the Links×Diameter product permitted by the Spidergon architecture remains always the lower. The only limitation of this architecture comes from it-s over cost in term of silicon area.
Keywords: Dynamic arbiter, Generic router, Spidergon NoC, SoC.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15701533 A Parallel Architecture for the Real Time Correction of Stereoscopic Images
Authors: Zohir Irki, Michel Devy
Abstract:
In this paper, we will present an architecture for the implementation of a real time stereoscopic images correction's approach. This architecture is parallel and makes use of several memory blocs in which are memorized pre calculated data relating to the cameras used for the acquisition of images. The use of reduced images proves to be essential in the proposed approach; the suggested architecture must so be able to carry out the real time reduction of original images.Keywords: Image reduction, Real-time correction, Parallel architecture, Parallel treatment.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 11071532 Security Engine Management of Router based on Security Policy
Authors: Su Hyung Jo, Ki Young Kim, Sang Ho Lee
Abstract:
Security management has changed from the management of security equipments and useful interface to manager. It analyzes the whole security conditions of network and preserves the network services from attacks. Secure router technology has security functions, such as intrusion detection, IPsec(IP Security) and access control, are applied to legacy router for secure networking. It controls an unauthorized router access and detects an illegal network intrusion. This paper relates to a security engine management of router based on a security policy, which is the definition of security function against a network intrusion. This paper explains the security policy and designs the structure of security engine management framework.Keywords: Policy server, security engine, security management, security policy
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 19211531 High-Speed Pipeline Implementation of Radix-2 DIF Algorithm
Authors: Christos Meletis, Paul Bougas, George Economakos , Paraskevas Kalivas, Kiamal Pekmestzi
Abstract:
In this paper, we propose a new architecture for the implementation of the N-point Fast Fourier Transform (FFT), based on the Radix-2 Decimation in Frequency algorithm. This architecture is based on a pipeline circuit that can process a stream of samples and produce two FFT transform samples every clock cycle. Compared to existing implementations the architecture proposed achieves double processing speed using the same circuit complexity.
Keywords: Digital signal processing, systolic circuits, FFTalgorithm.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 22171530 A Subjectively Influenced Router for Vehicles in a Four-Junction Traffic System
Authors: Anilkumar Kothalil Gopalakrishnan
Abstract:
A subjectively influenced router for vehicles in a fourjunction traffic system is presented. The router is based on a 3-layer Backpropagation Neural Network (BPNN) and a greedy routing procedure. The BPNN detects priorities of vehicles based on the subjective criteria. The subjective criteria and the routing procedure depend on the routing plan towards vehicles depending on the user. The routing procedure selects vehicles from their junctions based on their priorities and route them concurrently to the traffic system. That is, when the router is provided with a desired vehicles selection criteria and routing procedure, it routes vehicles with a reasonable junction clearing time. The cost evaluation of the router determines its efficiency. In the case of a routing conflict, the router will route the vehicles in a consecutive order and quarantine faulty vehicles. The simulations presented indicate that the presented approach is an effective strategy of structuring a subjective vehicle router.Keywords: Backpropagation Neural Network, Backpropagationalgorithm, Greedy routing procedure, Subjective criteria, Vehiclepriority, Cost evaluation, Route generation
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 13921529 Characteristics Analysis of Thermal Resistance of Cryogenic Pipeline in Vacuum Environment
Authors: Wang Zijuan, Ding Wenjing, Liu Ran
Abstract:
If an unsteady heat transfer or heat impulse happens in part of the cryogenic pipeline system of large space environment simulation equipment while running in vacuum environment, it will lead to abnormal flow of the cryogenic fluid in the pipeline. When the situation gets worse, the cryogenic fluid in the pipeline will have phase change and a gas block which results in the malfunction of the cryogenic pipeline system. Referring to the structural parameter of a typical cryogenic pipeline system and the basic equation, an analytical model and a calculation model for cryogenic pipeline system can be built. The various factors which influence the thermal resistance of a cryogenic pipeline system can be analyzed and calculated by using the qualitative analysis relation deduced for thermal resistance of pipeline. The research conclusion could provide theoretical support for the design and operation of a cryogenic pipeline systemKeywords: pipeline, vacuum, vapor quality
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 19031528 Implementing High Performance VPN Router using Cavium-s CN2560 Security Processor
Authors: Sang Su Lee, Sang Woo Lee, Yong Sung Jeon, Ki Young Kim
Abstract:
IPsec protocol[1] is a set of security extensions developed by the IETF and it provides privacy and authentication services at the IP layer by using modern cryptography. In this paper, we describe both of H/W and S/W architectures of our router system, SRS-10. The system is designed to support high performance routing and IPsec VPN. Especially, we used Cavium-s CN2560 processor to implement IPsec processing in inline-mode.Keywords: IP, router, VPN, IPsec.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 20341527 Design of High-speed Modified Booth Multipliers Operating at GHz Ranges
Authors: Soojin Kim, Kyeongsoon Cho
Abstract:
This paper describes the pipeline architecture of high-speed modified Booth multipliers. The proposed multiplier circuits are based on the modified Booth algorithm and the pipeline technique which are the most widely used to accelerate the multiplication speed. In order to implement the optimally pipelined multipliers, many kinds of experiments have been conducted. The speed of the multipliers is greatly improved by properly deciding the number of pipeline stages and the positions for the pipeline registers to be inserted. We described the proposed modified Booth multiplier circuits in Verilog HDL and synthesized the gate-level circuits using 0.13um standard cell library. The resultant multiplier circuits show better performance than others. Since the proposed multipliers operate at GHz ranges, they can be used in the systems requiring very high performance.Keywords: multiplier, pipeline, high-speed, modified Boothalgorithm.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 27291526 Concurrent Approach to Data Parallel Model using Java
Authors: Bala Dhandayuthapani Veerasamy
Abstract:
Parallel programming models exist as an abstraction of hardware and memory architectures. There are several parallel programming models in commonly use; they are shared memory model, thread model, message passing model, data parallel model, hybrid model, Flynn-s models, embarrassingly parallel computations model, pipelined computations model. These models are not specific to a particular type of machine or memory architecture. This paper expresses the model program for concurrent approach to data parallel model through java programming.Keywords: Concurrent, Data Parallel, JDK, Parallel, Thread
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 20971525 CPU Architecture Based on Static Hardware Scheduler Engine and Multiple Pipeline Registers
Authors: Ionel Zagan, Vasile Gheorghita Gaitan
Abstract:
The development of CPUs and of real-time systems based on them made it possible to use time at increasingly low resolutions. Together with the scheduling methods and algorithms, time organizing has been improved so as to respond positively to the need for optimization and to the way in which the CPU is used. This presentation contains both a detailed theoretical description and the results obtained from research on improving the performances of the nMPRA (Multi Pipeline Register Architecture) processor by implementing specific functions in hardware. The proposed CPU architecture has been developed, simulated and validated by using the FPGA Virtex-7 circuit, via a SoC project. Although the nMPRA processor hardware structure with five pipeline stages is very complex, the present paper presents and analyzes the tests dedicated to the implementation of the CPU and of the memory on-chip for instructions and data. In order to practically implement and test the entire SoC project, various tests have been performed. These tests have been performed in order to verify the drivers for peripherals and the boot module named Bootloader.
Keywords: Hardware scheduler, nMPRA processor, real-time systems, scheduling methods.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 10961524 Using of Latin Router for Routing Wavelength with Configuration Algorithm
Authors: A. Habiboghli, R. Mostafaei, M. R.Meybodi
Abstract:
Optical network uses a tool for routing which is called Latin router. These routers use particular algorithms for routing. In this paper, we present algorithm for configuration of optical network that is optimized regarding previous algorithm. We show that by decreasing the number of hops for source-destination in lightpath number of satisfied request is less. Also we had shown that more than single-hop lightpath relating single-hop lightpath is better.Keywords: Latin Router, Constraint Satisfied, Wavelength, Optical Network
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14751523 Flexible Wormhole-Switched Network-on-chip with Two-Level Priority Data Delivery Service
Authors: Faizal A. Samman, Thomas Hollstein, Manfred Glesner
Abstract:
A synchronous network-on-chip using wormhole packet switching and supporting guaranteed-completion best-effort with low-priority (LP) and high-priority (HP) wormhole packet delivery service is presented in this paper. Both our proposed LP and HP message services deliver a good quality of service in term of lossless packet completion and in-order message data delivery. However, the LP message service does not guarantee minimal completion bound. The HP packets will absolutely use 100% bandwidth of their reserved links if the HP packets are injected from the source node with maximum injection. Hence, the service are suitable for small size messages (less than hundred bytes). Otherwise the other HP and LP messages, which require also the links, will experience relatively high latency depending on the size of the HP message. The LP packets are routed using a minimal adaptive routing, while the HP packets are routed using a non-minimal adaptive routing algorithm. Therefore, an additional 3-bit field, identifying the packet type, is introduced in their packet headers to classify and to determine the type of service committed to the packet. Our NoC prototypes have been also synthesized using a 180-nm CMOS standard-cell technology to evaluate the cost of implementing the combination of both services.Keywords: Network-on-Chip, Parallel Pipeline Router Architecture, Wormhole Switching, Two-Level Priority Service.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17661522 Effect of Corrosion on Hydrocarbon Pipelines
Authors: Madjid Meriem-Benziane, Hamou Zahloul
Abstract:
The demand of hydrocarbons has increased the construction of pipelines and the protection of the physical and mechanical integrity of the already existing infrastructure. Corrosion is the main reason of failures in the pipeline and it is mostly produced by acid (HCOOCH3). In this basis, a CFD code was used, in order to study the corrosion of internal wall of hydrocarbons pipeline. In this situation, the corrosion phenomenon shows a growing deposit, which causes defect damages (welding or fabrication) at diverse positions along the pipeline. The solution of the pipeline corrosion is based on the diminution of the Naphthenic acid.Keywords: Pipeline, corrosion, Naphthenic acid (NA), CFD.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 33511521 A Survey on Performance Tools for OpenMP
Authors: Mubarak S. Mohsen, Rosni Abdullah, Yong M. Teo
Abstract:
Advances in processors architecture, such as multicore, increase the size of complexity of parallel computer systems. With multi-core architecture there are different parallel languages that can be used to run parallel programs. One of these languages is OpenMP which embedded in C/Cµ or FORTRAN. Because of this new architecture and the complexity, it is very important to evaluate the performance of OpenMP constructs, kernels, and application program on multi-core systems. Performance is the activity of collecting the information about the execution characteristics of a program. Performance tools consists of at least three interfacing software layers, including instrumentation, measurement, and analysis. The instrumentation layer defines the measured performance events. The measurement layer determines what performance event is actually captured and how it is measured by the tool. The analysis layer processes the performance data and summarizes it into a form that can be displayed in performance tools. In this paper, a number of OpenMP performance tools are surveyed, explaining how each is used to collect, analyse, and display data collection.Keywords: Parallel performance tools, OpenMP, multi-core.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 19231520 Low Power and Less Area Architecture for Integer Motion Estimation
Authors: C Hisham, K Komal, Amit K Mishra
Abstract:
Full search block matching algorithm is widely used for hardware implementation of motion estimators in video compression algorithms. In this paper we are proposing a new architecture, which consists of a 2D parallel processing unit and a 1D unit both working in parallel. The proposed architecture reduces both data access power and computational power which are the main causes of power consumption in integer motion estimation. It also completes the operations with nearly the same number of clock cycles as compared to a 2D systolic array architecture. In this work sum of absolute difference (SAD)-the most repeated operation in block matching, is calculated in two steps. The first step is to calculate the SAD for alternate rows by a 2D parallel unit. If the SAD calculated by the parallel unit is less than the stored minimum SAD, the SAD of the remaining rows is calculated by the 1D unit. Early termination, which stops avoidable computations has been achieved with the help of alternate rows method proposed in this paper and by finding a low initial SAD value based on motion vector prediction. Data reuse has been applied to the reference blocks in the same search area which significantly reduced the memory access.
Keywords: Sum of absolute difference, high speed DSP.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14931519 New VLSI Architecture for Motion Estimation Algorithm
Authors: V. S. K. Reddy, S. Sengupta, Y. M. Latha
Abstract:
This paper presents an efficient VLSI architecture design to achieve real time video processing using Full-Search Block Matching (FSBM) algorithm. The design employs parallel bank architecture with minimum latency, maximum throughput, and full hardware utilization. We use nine parallel processors in our architecture and each controlled by a state machine. State machine control implementation makes the design very simple and cost effective. The design is implemented using VHDL and the programming techniques we incorporated makes the design completely programmable in the sense that the search ranges and the block sizes can be varied to suit any given requirements. The design can operate at frequencies up to 36 MHz and it can function in QCIF and CIF video resolution at 1.46 MHz and 5.86 MHz, respectively.Keywords: Video Coding, Motion Estimation, Full-Search, Block-Matching, VLSI Architecture.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 18071518 JConqurr - A Multi-Core Programming Toolkit for Java
Authors: G.A.C.P. Ganegoda, D.M.A. Samaranayake, L.S. Bandara, K.A.D.N.K. Wimalawarne
Abstract:
With the popularity of the multi-core and many-core architectures there is a great requirement for software frameworks which can support parallel programming methodologies. In this paper we introduce an Eclipse toolkit, JConqurr which is easy to use and provides robust support for flexible parallel progrmaming. JConqurr is a multi-core and many-core programming toolkit for Java which is capable of providing support for common parallel programming patterns which include task, data, divide and conquer and pipeline parallelism. The toolkit uses an annotation and a directive mechanism to convert the sequential code into parallel code. In addition to that we have proposed a novel mechanism to achieve the parallelism using graphical processing units (GPU). Experiments with common parallelizable algorithms have shown that our toolkit can be easily and efficiently used to convert sequential code to parallel code and significant performance gains can be achieved.
Keywords: Multi-core, parallel programming patterns, GPU, Java, Eclipse plugin, toolkit,
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 21111517 Improving the Performances of the nMPRA Architecture by Implementing Specific Functions in Hardware
Authors: Ionel Zagan, Vasile Gheorghita Gaitan
Abstract:
Minimizing the response time to asynchronous events in a real-time system is an important factor in increasing the speed of response and an interesting concept in designing equipment fast enough for the most demanding applications. The present article will present the results regarding the validation of the nMPRA (Multi Pipeline Register Architecture) architecture using the FPGA Virtex-7 circuit. The nMPRA concept is a hardware processor with the scheduler implemented at the processor level; this is done without affecting a possible bus communication, as is the case with the other CPU solutions. The implementation of static or dynamic scheduling operations in hardware and the improvement of handling interrupts and events by the real-time executive described in the present article represent a key solution for eliminating the overhead of the operating system functions. The nMPRA processor is capable of executing a preemptive scheduling, using various algorithms without a software scheduler. Therefore, we have also presented various scheduling methods and algorithms used in scheduling the real-time tasks.
Keywords: nMPRA architecture, pipeline processor, preemptive scheduling, real-time system.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 8801516 3-D Numerical Model for Wave-Induced Seabed Response around an Offshore Pipeline
Authors: Zuodong Liang, Dong-Sheng Jeng
Abstract:
Seabed instability around an offshore pipeline is one of key factors that need to be considered in the design of offshore infrastructures. Unlike previous investigations, a three-dimensional numerical model for the wave-induced soil response around an offshore pipeline is proposed in this paper. The numerical model was first validated with 2-D experimental data available in the literature. Then, a parametric study will be carried out to examine the effects of wave, seabed characteristics and confirmation of pipeline. Numerical examples demonstrate significant influence of wave obliquity on the wave-induced pore pressures and the resultant seabed liquefaction around the pipeline, which cannot be observed in 2-D numerical simulation.Keywords: Pore pressure, 3D wave model, seabed liquefaction, pipeline.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 10421515 Using the PGAS Programming Paradigm for Biological Sequence Alignment on a Chip Multi-Threading Architecture
Authors: M. Bakhouya, S. A. Bahra, T. El-Ghazawi
Abstract:
The Partitioned Global Address Space (PGAS) programming paradigm offers ease-of-use in expressing parallelism through a global shared address space while emphasizing performance by providing locality awareness through the partitioning of this address space. Therefore, the interest in PGAS programming languages is growing and many new languages have emerged and are becoming ubiquitously available on nearly all modern parallel architectures. Recently, new parallel machines with multiple cores are designed for targeting high performance applications. Most of the efforts have gone into benchmarking but there are a few examples of real high performance applications running on multicore machines. In this paper, we present and evaluate a parallelization technique for implementing a local DNA sequence alignment algorithm using a PGAS based language, UPC (Unified Parallel C) on a chip multithreading architecture, the UltraSPARC T1.Keywords: Partitioned Global Address Space, Unified Parallel C, Multicore machines, Multi-threading Architecture, Sequence alignment.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 13911514 Optimization of Multicast Transmissions in NC-HMIPv6 Environment
Authors: Souleymane Oumtanaga, Kadjo Tanon Lambert, Koné Tiémoman, Tety Pierre, Kimou KouadioProsper
Abstract:
Multicast transmissions allow an host (the source) to send only one flow bound for a group of hosts (the receivers). Any equipment eager to belong to the group may explicitly register itself to that group via its multicast router. This router will be given the responsibility to convey all information relating to the group to all registered hosts. However in an environment in which the final receiver or the source frequently moves, the multicast flows need particular treatment. This constitutes one of the multicast transmissions problems around which several proposals were made in the Mobile IPv6 case in general. In this article, we describe the problems involved in this IPv6 multicast mobility and the existing proposals for their resolution. Then architecture will be proposed aiming to satisfy and optimize these transmissions in the specific case of a mobile multicast receiver in NC-HMIPv6 environment.
Keywords: Mobile IP, NC-HMIPv6, Multicast, MLD, PIM, SSM, Rendezvous Point.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15191513 Development of a Pipeline Monitoring System by Bio-mimetic Robots
Authors: Seung You Na, Daejung Shin, Jin Young Kim, Joo Hyun Jung, Yong-Gwan Won
Abstract:
To explore pipelines is one of various bio-mimetic robot applications. The robot may work in common buildings such as between ceilings and ducts, in addition to complicated and massive pipeline systems of large industrial plants. The bio-mimetic robot finds any troubled area or malfunction and then reports its data. Importantly, it can not only prepare for but also react to any abnormal routes in the pipeline. The pipeline monitoring tasks require special types of mobile robots. For an effective movement along a pipeline, the movement of the robot will be similar to that of insects or crawling animals. During its movement along the pipelines, a pipeline monitoring robot has an important task of finding the shapes of the approaching path on the pipes. In this paper we propose an effective solution to the pipeline pattern recognition, based on the fuzzy classification rules for the measured IR distance data.Keywords: Bio-mimetic robots, Plant pipes monitoring, Pipepattern recognition.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 16491512 Parallel Block Backward Differentiation Formulas For Solving Large Systems of Ordinary Differential Equations
Authors: Zarina Bibi, I., Khairil Iskandar, O.
Abstract:
In this paper, parallelism in the solution of Ordinary Differential Equations (ODEs) to increase the computational speed is studied. The focus is the development of parallel algorithm of the two point Block Backward Differentiation Formulas (PBBDF) that can take advantage of the parallel architecture in computer technology. Parallelism is obtained by using Message Passing Interface (MPI). Numerical results are given to validate the efficiency of the PBBDF implementation as compared to the sequential implementation.Keywords: Ordinary differential equations, parallel.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 16691511 A Low-cost Reconfigurable Architecture for AES Algorithm
Authors: Yibo Fan, Takeshi Ikenaga, Yukiyasu Tsunoo, Satoshi Goto
Abstract:
This paper proposes a low-cost reconfigurable architecture for AES algorithm. The proposed architecture separates SubBytes and MixColumns into two parallel data path, and supports different bit-width operation for this two data path. As a result, different number of S-box can be supported in this architecture. The throughput and power consumption can be adjusted by changing the number of S-box running in this design. Using the TSMC 0.18μm CMOS standard cell library, a very low-cost implementation of 7K Gates is obtained under 182MHz frequency. The maximum throughput is 360Mbps while using 4 S-Box simultaneously, and the minimum throughput is 114Mbps while only using 1 S-BoxKeywords: AES, Reconfigurable architecture, low cost
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 20671510 Recent Developments in Speed Control System of Pipeline PIGs for Deepwater Pipeline Applications
Authors: Mohamad Azmi Haniffa, Fakhruldin Mohd Hashim
Abstract:
Pipeline infrastructures normally represent high cost of investment and the pipeline must be free from risks that could cause environmental hazard and potential threats to personnel safety. Pipeline integrity such monitoring and management become very crucial to provide unimpeded transportation and avoiding unnecessary production deferment. Thus proper cleaning and inspection is the key to safe and reliable pipeline operation and plays an important role in pipeline integrity management program and has become a standard industry procedure. In view of this, understanding the motion (dynamic behavior), prediction and control of the PIG speed is important in executing pigging operation as it offers significant benefits, such as estimating PIG arrival time at receiving station, planning for suitable pigging operation, and improves efficiency of pigging tasks. The objective of this paper is to review recent developments in speed control system of pipeline PIGs. The review carried out would serve as an industrial application in a form of quick reference of recent developments in pipeline PIG speed control system, and further initiate others to add-in/update the list in the future leading to knowledge based data, and would attract active interest of others to share their view points.
Keywords: Pipeline Inspection Gauge (PIG), In Line Inspection Tools (ILI), PIG motion, PIG speed control system
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 33311509 Hybrid Prefix Adder Architecture for Minimizing the Power Delay Product
Authors: P.Ramanathan, P.T.Vanathi
Abstract:
Parallel Prefix addition is a technique for improving the speed of binary addition. Due to continuing integrating intensity and the growing needs of portable devices, low-power and highperformance designs are of prime importance. The classical parallel prefix adder structures presented in the literature over the years optimize for logic depth, area, fan-out and interconnect count of logic circuits. In this paper, a new architecture for performing 8-bit, 16-bit and 32-bit Parallel Prefix addition is proposed. The proposed prefix adder structures is compared with several classical adders of same bit width in terms of power, delay and number of computational nodes. The results reveal that the proposed structures have the least power delay product when compared with its peer existing Prefix adder structures. Tanner EDA tool was used for simulating the adder designs in the TSMC 180 nm and TSMC 130 nm technologies.Keywords: Parallel Prefix Adder (PPA), Dot operator, Semi-Dotoperator, Complementary Metal Oxide Semiconductor (CMOS), Odd-dot operator, Even-dot operator, Odd-semi-dot operator andEven-semi-dot operator.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17261508 High Performance VLSI Architecture of 2D Discrete Wavelet Transform with Scalable Lattice Structure
Authors: Juyoung Kim, Taegeun Park
Abstract:
In this paper, we propose a fully-utilized, block-based 2D DWT (discrete wavelet transform) architecture, which consists of four 1D DWT filters with two-channel QMF lattice structure. The proposed architecture requires about 2MN-3N registers to save the intermediate results for higher level decomposition, where M and N stand for the filter length and the row width of the image respectively. Furthermore, the proposed 2D DWT processes in horizontal and vertical directions simultaneously without an idle period, so that it computes the DWT for an N×N image in a period of N2(1-2-2J)/3. Compared to the existing approaches, the proposed architecture shows 100% of hardware utilization and high throughput rates. To mitigate the long critical path delay due to the cascaded lattices, we can apply the pipeline technique with four stages, while retaining 100% of hardware utilization. The proposed architecture can be applied in real-time video signal processing.
Keywords: discrete wavelet transform, VLSI architecture, QMF lattice filter, pipelining.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17821507 A Practical Distributed String Matching Algorithm Architecture and Implementation
Authors: Bi Kun, Gu Nai-jie, Tu Kun, Liu Xiao-hu, Liu Gang
Abstract:
Traditional parallel single string matching algorithms are always based on PRAM computation model. Those algorithms concentrate on the cost optimal design and the theoretical speed. Based on the distributed string matching algorithm proposed by CHEN, a practical distributed string matching algorithm architecture is proposed in this paper. And also an improved single string matching algorithm based on a variant Boyer-Moore algorithm is presented. We implement our algorithm on the above architecture and the experiments prove that it is really practical and efficient on distributed memory machine. Its computation complexity is O(n/p + m), where n is the length of the text, and m is the length of the pattern, and p is the number of the processors.Keywords: Boyer-Moore algorithm, distributed algorithm, parallel string matching, string matching.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2189