Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1911

Search results for: Hardware Implementation

1911 Efficient Pipelined Hardware Implementation of RIPEMD-160 Hash Function

Authors: H. E. Michail, V. N. Thanasoulis, G. A. Panagiotakopoulos, A. P. Kakarountas, C. E. Goutis

Abstract:

In this paper an efficient implementation of Ripemd- 160 hash function is presented. Hash functions are a special family of cryptographic algorithms, which is used in technological applications with requirements for security, confidentiality and validity. Applications like PKI, IPSec, DSA, MAC-s incorporate hash functions and are used widely today. The Ripemd-160 is emanated from the necessity for existence of very strong algorithms in cryptanalysis. The proposed hardware implementation can be synthesized easily for a variety of FPGA and ASIC technologies. Simulation results, using commercial tools, verified the efficiency of the implementation in terms of performance and throughput. Special care has been taken so that the proposed implementation doesn-t introduce extra design complexity; while in parallel functionality was kept to the required levels.

Keywords: Hardware implementation, hash functions, Ripemd-160, security.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1653
1910 Design of a Neural Networks Classifier for Face Detection

Authors: F. Smach, M. Atri, J. Mitéran, M. Abid

Abstract:

Face detection and recognition has many applications in a variety of fields such as security system, videoconferencing and identification. Face classification is currently implemented in software. A hardware implementation allows real-time processing, but has higher cost and time to-market. The objective of this work is to implement a classifier based on neural networks MLP (Multi-layer Perceptron) for face detection. The MLP is used to classify face and non-face patterns. The systm is described using C language on a P4 (2.4 Ghz) to extract weight values. Then a Hardware implementation is achieved using VHDL based Methodology. We target Xilinx FPGA as the implementation support.

Keywords: Classification, Face Detection, FPGA Hardware description, MLP.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2076
1909 An Efficient Hardware Implementation of Extended and Fast Physical Addressing in Microprocessor-Based Systems Using Programmable Logic

Authors: Mountassar Maamoun, Abdelhamid Meraghni, Abdelhalim Benbelkacem, Daoud Berkani

Abstract:

This paper describes an efficient hardware implementation of a new technique for interfacing the data exchange between the microprocessor-based systems and the external devices. This technique, based on the use of software/hardware system and a reduced physical address, enlarges the interfacing capacity of the microprocessor-based systems, uses the Direct Memory Access (DMA) to increases the frequency of the new bus, and improves the speed of data exchange. While using this architecture in microprocessor-based system or in computer, the input of the hardware part of our system will be connected to the bus system, and the output, which is a new bus, will be connected to an external device. The new bus is composed of a data bus, a control bus and an address bus. A Xilinx Integrated Software Environment (ISE) 7.1i has been used for the programmable logic implementation.

Keywords: Interfacing, Software/hardware System, CPLD, programmable logic, DMA.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1123
1908 A Novel Genetic Algorithm Designed for Hardware Implementation

Authors: Zhenhuan Zhu, David Mulvaney, Vassilios Chouliaras

Abstract:

A new genetic algorithm, termed the 'optimum individual monogenetic genetic algorithm' (OIMGA), is presented whose properties have been deliberately designed to be well suited to hardware implementation. Specific design criteria were to ensure fast access to the individuals in the population, to keep the required silicon area for hardware implementation to a minimum and to incorporate flexibility in the structure for the targeting of a range of applications. The first two criteria are met by retaining only the current optimum individual, thereby guaranteeing a small memory requirement that can easily be stored in fast on-chip memory. Also, OIMGA can be easily reconfigured to allow the investigation of problems that normally warrant either large GA populations or individuals many genes in length. Local convergence is achieved in OIMGA by retaining elite individuals, while population diversity is ensured by continually searching for the best individuals in fresh regions of the search space. The results given in this paper demonstrate that both the performance of OIMGA and its convergence time are superior to those of a range of existing hardware GA implementations.

Keywords: Genetic algorithms, genetic hardware, machinelearning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1602
1907 A Pipelined FSBM Hardware Architecture for HTDV-H.26x

Authors: H. Loukil, A. Ben Atitallah, F. Ghozzi, M. A. Ben Ayed, N. Masmoudi

Abstract:

In MPEG and H.26x standards, to eliminate the temporal redundancy we use motion estimation. Given that the motion estimation stage is very complex in terms of computational effort, a hardware implementation on a re-configurable circuit is crucial for the requirements of different real time multimedia applications. In this paper, we present hardware architecture for motion estimation based on "Full Search Block Matching" (FSBM) algorithm. This architecture presents minimum latency, maximum throughput, full utilization of hardware resources such as embedded memory blocks, and combining both pipelining and parallel processing techniques. Our design is described in VHDL language, verified by simulation and implemented in a Stratix II EP2S130F1020C4 FPGA circuit. The experiment result show that the optimum operating clock frequency of the proposed design is 89MHz which achieves 160M pixels/sec.

Keywords: SAD, FSBM, Hardware Implementation, FPGA.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1393
1906 The Hardware Implementation of a Novel Genetic Algorithm

Authors: Zhenhuan Zhu, David Mulvaney, Vassilios Chouliaras

Abstract:

This paper presents a novel genetic algorithm, termed the Optimum Individual Monogenetic Algorithm (OIMGA) and describes its hardware implementation. As the monogenetic strategy retains only the optimum individual, the memory requirement is dramatically reduced and no crossover circuitry is needed, thereby ensuring the requisite silicon area is kept to a minimum. Consequently, depending on application requirements, OIMGA allows the investigation of solutions that warrant either larger GA populations or individuals of greater length. The results given in this paper demonstrate that both the performance of OIMGA and its convergence time are superior to those of existing hardware GA implementations. Local convergence is achieved in OIMGA by retaining elite individuals, while population diversity is ensured by continually searching for the best individuals in fresh regions of the search space.

Keywords: Genetic algorithms, hardware-based machinelearning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1400
1905 2-D Realization of WiMAX Channel Interleaver for Efficient Hardware Implementation

Authors: Rizwan Asghar, Dake Liu

Abstract:

The direct implementation of interleaver functions in WiMAX is not hardware efficient due to presence of complex functions. Also the conventional method i.e. using memories for storing the permutation tables is silicon consuming. This work presents a 2-D transformation for WiMAX channel interleaver functions which reduces the overall hardware complexity to compute the interleaver addresses on the fly. A fully reconfigurable architecture for address generation in WiMAX channel interleaver is presented, which consume 1.1 k-gates in total. It can be configured for any block size and any modulation scheme in WiMAX. The presented architecture can run at a frequency of 200 MHz, thus fully supporting high bandwidth requirements for WiMAX.

Keywords: Interleaver, deinterleaver, WiMAX, 802.16e.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2006
1904 Hardware Implementations for the ISO/IEC 18033-4:2005 Standard for Stream Ciphers

Authors: Paris Kitsos

Abstract:

In this paper the FPGA implementations for four stream ciphers are presented. The two stream ciphers, MUGI and SNOW 2.0 are recently adopted by the International Organization for Standardization ISO/IEC 18033-4:2005 standard. The other two stream ciphers, MICKEY 128 and TRIVIUM have been submitted and are under consideration for the eSTREAM, the ECRYPT (European Network of Excellence for Cryptology) Stream Cipher project. All ciphers were coded using VHDL language. For the hardware implementation, an FPGA device was used. The proposed implementations achieve throughputs range from 166 Mbps for MICKEY 128 to 6080 Mbps for MUGI.

Keywords: Cryptography, ISO/IEC 18033-4:2005 standard, Hardware implementation, Stream ciphers

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1534
1903 FPGA Implementation of RSA Cryptosystem

Authors: Ridha Ghayoula, ElAmjed Hajlaoui, Talel Korkobi, Mbarek Traii, Hichem Trabelsi

Abstract:

In this paper, the hardware implementation of the RSA public-key cryptographic algorithm is presented. The RSA cryptographic algorithm is depends on the computation of repeated modular exponentials. The Montgomery algorithm is used and modified to reduce hardware resources and to achieve reasonable operating speed for FPGA. An efficient architecture for modular multiplications based on the array multiplier is proposed. We have implemented a RSA cryptosystem based on Montgomery algorithm. As a result, it is shown that proposed architecture contributes to small area and reasonable speed.

Keywords: RSA, Cryptosystem, Montgomery, Implementation.FPGA.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2170
1902 An FPGA Implementation of Intelligent Visual Based Fall Detection

Authors: Peng Shen Ong, Yoong Choon Chang, Chee Pun Ooi, Ettikan K. Karuppiah, Shahirina Mohd Tahir

Abstract:

Falling has been one of the major concerns and threats to the independence of the elderly in their daily lives. With the worldwide significant growth of the aging population, it is essential to have a promising solution of fall detection which is able to operate at high accuracy in real-time and supports large scale implementation using multiple cameras. Field Programmable Gate Array (FPGA) is a highly promising tool to be used as a hardware accelerator in many emerging embedded vision based system. Thus, it is the main objective of this paper to present an FPGA-based solution of visual based fall detection to meet stringent real-time requirements with high accuracy. The hardware architecture of visual based fall detection which utilizes the pixel locality to reduce memory accesses is proposed. By exploiting the parallel and pipeline architecture of FPGA, our hardware implementation of visual based fall detection using FGPA is able to achieve a performance of 60fps for a series of video analytical functions at VGA resolutions (640x480). The results of this work show that FPGA has great potentials and impacts in enabling large scale vision system in the future healthcare industry due to its flexibility and scalability.

Keywords: Fall detection, FPGA, hardware implementation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2236
1901 Hardware Implementation of Local Binary Pattern Based Two-Bit Transform Motion Estimation

Authors: Seda Yavuz, Anıl Çelebi, Aysun Taşyapı Çelebi, Oğuzhan Urhan

Abstract:

Nowadays, demand for using real-time video transmission capable devices is ever-increasing. So, high resolution videos have made efficient video compression techniques an essential component for capturing and transmitting video data. Motion estimation has a critical role in encoding raw video. Hence, various motion estimation methods are introduced to efficiently compress the video. Low bit‑depth representation based motion estimation methods facilitate computation of matching criteria and thus, provide small hardware footprint. In this paper, a hardware implementation of a two-bit transformation based low-complexity motion estimation method using local binary pattern approach is proposed. Image frames are represented in two-bit depth instead of full-depth by making use of the local binary pattern as a binarization approach and the binarization part of the hardware architecture is explained in detail. Experimental results demonstrate the difference between the proposed hardware architecture and the architectures of well-known low-complexity motion estimation methods in terms of important aspects such as resource utilization, energy and power consumption.

Keywords: Binarization, hardware architecture, local binary pattern, motion estimation, two-bit transform.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 810
1900 Digital Filter for Cochlear Implant Implemented on a Field- Programmable Gate Array

Authors: Rekha V. Dundur , M.V.Latte, S.Y. Kulkarni, M.K.Venkatesha

Abstract:

The advent of multi-million gate Field Programmable Gate Arrays (FPGAs) with hardware support for multiplication opens an opportunity to recreate a significant portion of the front end of a human cochlea using this technology. In this paper we describe the implementation of the cochlear filter and show that it is entirely suited to a single device XC3S500 FPGA implementation .The filter gave a good fit to real time data with efficiency of hardware usage.

Keywords: Cochlea, FPGA, IIR (Infinite Impulse Response), Multiplier.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2055
1899 Cellular Automata Based Robust Watermarking Architecture towards the VLSI Realization

Authors: V. H. Mankar, T. S. Das, S. K. Sarkar

Abstract:

In this paper, we have proposed a novel blind watermarking architecture towards its hardware implementation in VLSI. In order to facilitate this hardware realization, cellular automata (CA) concept is introduced. The CA has been already accepted as an attractive structure for VLSI implementation because of its modularity, parallelism, high performance and reliability. The hardware realizable multiresolution spread spectrum watermarking techniques are very few in numbers in spite of their best ever resiliency against signal impairments. This is because of the computational cost and complexity associated with their different filter banks and lifting techniques. The concept of cellular automata theory in order to form a new transform domain technique i.e. Cellular Automata Transform (CAT) have been incorporated. Since CA provides spreading sequences having very low cross-correlation properties, the CA based pseudorandom sequence generator is considered in the present work. Considering the watermarking technique as a digital communication process, an error control coding (ECC) must be incorporated in the data hiding schemes. Besides the hardware implementation of entire CA based data hiding technique, the individual blocks of the algorithm using CA provide the best result than that of some other methods irrespective of the hardware and software technique. The Cellular Automata Transform, CA based PN sequence generator, and CA ECC are the requisite blocks that are developed not only to meet the reliable hardware requirements but also for the basic spread spectrum watermarking features. The proposed algorithm shows statistical invisibility and resiliency against various common signal-processing operations. This algorithmic design utilizes the existing allocated bandwidth in the data transmission channel in a more efficient manner.

Keywords: Cellular automata, watermarking, error control coding, PN sequence, VLSI.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1819
1898 Multi-board Run-time Reconfigurable Implementation of Intrinsic Evolvable Hardware

Authors: Cyrille Lambert, Tatiana Kalganova, Emanuele Stomeo, Manissa Wilson

Abstract:

A multi-board run-time reconfigurable (MRTR) system for evolvable hardware (EHW) is introduced with the aim to implement on hardware the bidirectional incremental evolution (BIE) method. The main features of this digital intrinsic EHW solution rely on the multi-board approach, the variable chromosome length management and the partial configuration of the reconfigurable circuit. These three features provide a high scalability to the solution. The design has been written in VHDL with the concern of not being platform dependant in order to keep a flexibility factor as high as possible. This solution helps tackling the problem of evolving complex task on digital configurable support.

Keywords: Evolvable Hardware, Evolutionary Strategy, multiboardFPGA system.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1304
1897 High Level Synthesis of Canny Edge Detection Algorithm on Zynq Platform

Authors: Hanaa M. Abdelgawad, Mona Safar, Ayman M. Wahba

Abstract:

Real time image and video processing is a demand in many computer vision applications, e.g. video surveillance, traffic management and medical imaging. The processing of those video applications requires high computational power. Thus, the optimal solution is the collaboration of CPU and hardware accelerators. In this paper, a Canny edge detection hardware accelerator is proposed. Edge detection is one of the basic building blocks of video and image processing applications. It is a common block in the pre-processing phase of image and video processing pipeline. Our presented approach targets offloading the Canny edge detection algorithm from processing system (PS) to programmable logic (PL) taking the advantage of High Level Synthesis (HLS) tool flow to accelerate the implementation on Zynq platform. The resulting implementation enables up to a 100x performance improvement through hardware acceleration. The CPU utilization drops down and the frame rate jumps to 60 fps of 1080p full HD input video stream.

Keywords: High Level Synthesis, Canny edge detection, Hardware accelerators, and Computer Vision.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 5139
1896 Adaptive Multiple Transforms Hardware Architecture for Versatile Video Coding

Authors: T. Damak, S. Houidi, M. A. Ben Ayed, N. Masmoudi

Abstract:

The Versatile Video Coding standard (VVC) is actually under development by the Joint Video Exploration Team (or JVET). An Adaptive Multiple Transforms (AMT) approach was announced. It is based on different transform modules that provided an efficient coding. However, the AMT solution raises several issues especially regarding the complexity of the selected set of transforms. This can be an important issue, particularly for a future industrial adoption. This paper proposed an efficient hardware implementation of the most used transform in AMT approach: the DCT II. The developed circuit is adapted to different block sizes and can reach a minimum frequency of 192 MHz allowing an optimized execution time.

Keywords: AMT, DCT II, hardware, transform, VVC.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 280
1895 FPGA Based Parallel Architecture for the Computation of Third-Order Cross Moments

Authors: Syed Manzoor Qasim, Shuja Abbasi, Saleh Alshebeili, Bandar Almashary, Ateeq Ahmad Khan

Abstract:

Higher-order Statistics (HOS), also known as cumulants, cross moments and their frequency domain counterparts, known as poly spectra have emerged as a powerful signal processing tool for the synthesis and analysis of signals and systems. Algorithms used for the computation of cross moments are computationally intensive and require high computational speed for real-time applications. For efficiency and high speed, it is often advantageous to realize computation intensive algorithms in hardware. A promising solution that combines high flexibility together with the speed of a traditional hardware is Field Programmable Gate Array (FPGA). In this paper, we present FPGA-based parallel architecture for the computation of third-order cross moments. The proposed design is coded in Very High Speed Integrated Circuit (VHSIC) Hardware Description Language (VHDL) and functionally verified by implementing it on Xilinx Spartan-3 XC3S2000FG900-4 FPGA. Implementation results are presented and it shows that the proposed design can operate at a maximum frequency of 86.618 MHz.

Keywords: Cross moments, Cumulants, FPGA, Hardware Implementation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1442
1894 Neural Network Implementation Using FPGA: Issues and Application

Authors: A. Muthuramalingam, S. Himavathi, E. Srinivasan

Abstract:

.Hardware realization of a Neural Network (NN), to a large extent depends on the efficient implementation of a single neuron. FPGA-based reconfigurable computing architectures are suitable for hardware implementation of neural networks. FPGA realization of ANNs with a large number of neurons is still a challenging task. This paper discusses the issues involved in implementation of a multi-input neuron with linear/nonlinear excitation functions using FPGA. Implementation method with resource/speed tradeoff is proposed to handle signed decimal numbers. The VHDL coding developed is tested using Xilinx XC V50hq240 Chip. To improve the speed of operation a lookup table method is used. The problems involved in using a lookup table (LUT) for a nonlinear function is discussed. The percentage saving in resource and the improvement in speed with an LUT for a neuron is reported. An attempt is also made to derive a generalized formula for a multi-input neuron that facilitates to estimate approximately the total resource requirement and speed achievable for a given multilayer neural network. This facilitates the designer to choose the FPGA capacity for a given application. Using the proposed method of implementation a neural network based application, namely, a Space vector modulator for a vector-controlled drive is presented

Keywords: FPGA implementation, multi-input neuron, neural network, nn based space vector modulator.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3951
1893 Fully Parameterizable FPGA based Crypto-Accelerator

Authors: Iqbalur Rahman, Miftahur Rahman, Abul L Haque, Mostafizur Rahman,

Abstract:

In this paper, RSA encryption algorithm and its hardware implementation in Xilinx-s Virtex Field Programmable Gate Arrays (FPGA) is analyzed. The issues of scalability, flexible performance, and silicon efficiency for the hardware acceleration of public key crypto systems are being explored in the present work. Using techniques based on the interleaved math for exponentiation, the proposed RSA calculation architecture is compared to existing FPGA-based solutions for speed, FPGA utilization, and scalability. The paper covers the RSA encryption algorithm, interleaved multiplication, Miller Rabin algorithm for primality test, extended Euclidean math, basic FPGA technology, and the implementation details of the proposed RSA calculation architecture. Performance of several alternative hardware architectures is discussed and compared. Finally, conclusion is drawn, highlighting the advantages of a fully flexible & parameterized design.

Keywords: Crypto Accelerator, FPGA, Public Key Cryptography, RSA.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2465
1892 Implementation of Adder-Subtracter Design with VerilogHDL

Authors: May Phyo Thwal, Khin Htay Kyi, Kyaw Swar Soe

Abstract:

According to the density of the chips, designers are trying to put so any facilities of computational and storage on single chips. Along with the complexity of computational and storage circuits, the designing, testing and debugging become more and more complex and expensive. So, hardware design will be built by using very high speed hardware description language, which is more efficient and cost effective. This paper will focus on the implementation of 32-bit ALU design based on Verilog hardware description language. Adder and subtracter operate correctly on both unsigned and positive numbers. In ALU, addition takes most of the time if it uses the ripple-carry adder. The general strategy for designing fast adders is to reduce the time required to form carry signals. Adders that use this principle are called carry look- ahead adder. The carry look-ahead adder is to be designed with combination of 4-bit adders. The syntax of Verilog HDL is similar to the C programming language. This paper proposes a unified approach to ALU design in which both simulation and formal verification can co-exist.

Keywords: Addition, arithmetic logic unit, carry look-ahead adder, Verilog HDL.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 8684
1891 Development of A Meta Description Language for Software/Hardware Cooperative Design and Verification for Model-Checking Systems

Authors: Katsumi Wasaki, Naoki Iwasaki

Abstract:

Model-checking tools such as Symbolic Model Verifier (SMV) and NuSMV are available for checking hardware designs. These tools can automatically check the formal legitimacy of a design. However, NuSMV is too low level for describing a complete hardware design. It is therefore necessary to translate the system definition, as designed in a language such as Verilog or VHDL, into a language such as NuSMV for validation. In this paper, we present a meta hardware description language, Melasy, that contains a code generator for existing hardware description languages (HDLs) and languages for model checking that solve this problem.

Keywords: meta description language, software/hardware codesign, co-verification, formal verification, hardware compiler, modelchecking.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1238
1890 Hardware Approach to Solving Password Exposure Problem through Keyboard Sniff

Authors: Kyungroul Lee, Kwangjin Bae, Kangbin Yim

Abstract:

This paper introduces a hardware solution to password exposure problem caused by direct accesses to the keyboard hardware interfaces through which a possible attacker is able to grab user-s password even where existing countermeasures are deployed. Several researches have proposed reasonable software based solutions to the problem for years. However, recently introduced hardware vulnerability problems have neutralized the software approaches and yet proposed any effective software solution to the vulnerability. Hardware approach in this paper is expected as the only solution to the vulnerability

Keywords: Keyboard sniff, password exposure, hardware vulnerability, privacy problem, insider security.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1330
1889 Run-Time Customisation of Soft-Core CPUs on Field Programmable Gate Array

Authors: Rehab Abdullah Shendi

Abstract:

The use of customised soft-core processors in which instructions can be integrated into a system in application hardware is increasing in the Field Programmable Gate Array (FPGA) field. Specifically, the partial run-time reconfiguration of FPGAs in specialised processors for a particular domain can be very beneficial. In this report, the design and implementation for the customisation of a soft-core MIPS processor using an FPGA and partial reconfiguration (PR) of FPGA technology will be addressed to achieve efficient resource use. This can be achieved using a PR design flow that helps the design fit into a smaller device. Moreover, the impact of static power consumption could be reduced due to runtime reconfiguration. This will be done by configurable custom instructions implemented in the hardware as an extension on the MIPS CPU. The aim of this project is to investigate the PR of FPGAs for run-time adaptations of the instruction set of a soft-core CPU, including the integration of custom instructions and the exploration of the potential to use the MultiBoot feature available in Xilinx FPGAs to carry out the PR process. The system will be evaluated and tested on a Nexus 3 development board featuring a Xilinx Spartran-6 FPGA. The system will be able to load reconfigurable custom instructions dynamically into user programs with the help of the trap handler when the custom instruction is called by the MIPS CPU. The results of this experiment demonstrate that custom instructions in hardware can speed up a certain function and many instructions can be saved when compared to a software implementation of the same function. Implementing custom instructions in hardware is perfectly possible and worth exploring.

Keywords: Customisation, FPGA, MIPS, partial reconfiguration.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 901
1888 FPGA Implementation of Generalized Maximal Ratio Combining Receiver Diversity

Authors: Rafic Ayoubi, Jean-Pierre Dubois, Rania Minkara

Abstract:

In this paper, we study FPGA implementation of a novel supra-optimal receiver diversity combining technique, generalized maximal ratio combining (GMRC), for wireless transmission over fading channels in SIMO systems. Prior published results using ML-detected GMRC diversity signal driven by BPSK showed superior bit error rate performance to the widely used MRC combining scheme in an imperfect channel estimation (ICE) environment. Under perfect channel estimation conditions, the performance of GMRC and MRC were identical. The main drawback of the GMRC study was that it was theoretical, thus successful FPGA implementation of it using pipeline techniques is needed as a wireless communication test-bed for practical real-life situations. Simulation results showed that the hardware implementation was efficient both in terms of speed and area. Since diversity combining is especially effective in small femto- and picocells, internet-associated wireless peripheral systems are to benefit most from GMRC. As a result, many spinoff applications can be made to the hardware of IP-based 4th generation networks.

Keywords: Femto-internet cells, field-programmable gate array, generalized maximal-ratio combining, Lyapunov fractal dimension, pipelining technique, wireless SIMO channels.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2353
1887 CPU Architecture Based on Static Hardware Scheduler Engine and Multiple Pipeline Registers

Authors: Ionel Zagan, Vasile Gheorghita Gaitan

Abstract:

The development of CPUs and of real-time systems based on them made it possible to use time at increasingly low resolutions. Together with the scheduling methods and algorithms, time organizing has been improved so as to respond positively to the need for optimization and to the way in which the CPU is used. This presentation contains both a detailed theoretical description and the results obtained from research on improving the performances of the nMPRA (Multi Pipeline Register Architecture) processor by implementing specific functions in hardware. The proposed CPU architecture has been developed, simulated and validated by using the FPGA Virtex-7 circuit, via a SoC project. Although the nMPRA processor hardware structure with five pipeline stages is very complex, the present paper presents and analyzes the tests dedicated to the implementation of the CPU and of the memory on-chip for instructions and data. In order to practically implement and test the entire SoC project, various tests have been performed. These tests have been performed in order to verify the drivers for peripherals and the boot module named Bootloader.

Keywords: Hardware scheduler, nMPRA processor, real-time systems, scheduling methods.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 654
1886 An Embedded System for Artificial Intelligence Applications

Authors: Ioannis P. Panagopoulos, Christos C. Pavlatos, George K. Papakonstantinou

Abstract:

Conventional approaches in the implementation of logic programming applications on embedded systems are solely of software nature. As a consequence, a compiler is needed that transforms the initial declarative logic program to its equivalent procedural one, to be programmed to the microprocessor. This approach increases the complexity of the final implementation and reduces the overall system's performance. On the contrary, presenting hardware implementations which are only capable of supporting logic programs prevents their use in applications where logic programs need to be intertwined with traditional procedural ones, for a specific application. We exploit HW/SW codesign methods to present a microprocessor, capable of supporting hybrid applications using both programming approaches. We take advantage of the close relationship between attribute grammar (AG) evaluation and knowledge engineering methods to present a programmable hardware parser that performs logic derivations and combine it with an extension of a conventional RISC microprocessor that performs the unification process to report the success or failure of those derivations. The extended RISC microprocessor is still capable of executing conventional procedural programs, thus hybrid applications can be implemented. The presented implementation is programmable, supports the execution of hybrid applications, increases the performance of logic derivations (experimental analysis yields an approximate 1000% increase in performance) and reduces the complexity of the final implemented code. The proposed hardware design is supported by a proposed extended C-language called C-AG.

Keywords: Attribute Grammars, Logic Programming, RISC microprocessor.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4854
1885 Hardware Implementation of Stack-Based Replacement Algorithms

Authors: Hassan Ghasemzadeh, Sepideh Mazrouee, Hassan Goldani Moghaddam, Hamid Shojaei, Mohammad Reza Kakoee

Abstract:

Block replacement algorithms to increase hit ratio have been extensively used in cache memory management. Among basic replacement schemes, LRU and FIFO have been shown to be effective replacement algorithms in terms of hit rates. In this paper, we introduce a flexible stack-based circuit which can be employed in hardware implementation of both LRU and FIFO policies. We propose a simple and efficient architecture such that stack-based replacement algorithms can be implemented without the drawbacks of the traditional architectures. The stack is modular and hence, a set of stack rows can be cascaded depending on the number of blocks in each cache set. Our circuit can be implemented in conjunction with the cache controller and static/dynamic memories to form a cache system. Experimental results exhibit that our proposed circuit provides an average value of 26% improvement in storage bits and its maximum operating frequency is increased by a factor of two

Keywords: Cache Memory, Replacement Algorithms, LeastRecently Used Algorithm, First In First Out Algorithm.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3149
1884 Low Complexity Multi Mode Interleaver Core for WiMAX with Support for Convolutional Interleaving

Authors: Rizwan Asghar, Dake Liu

Abstract:

A hardware efficient, multi mode, re-configurable architecture of interleaver/de-interleaver for multiple standards, like DVB, WiMAX and WLAN is presented. The interleavers consume a large part of silicon area when implemented by using conventional methods as they use memories to store permutation patterns. In addition, different types of interleavers in different standards cannot share the hardware due to different construction methodologies. The novelty of the work presented in this paper is threefold: 1) Mapping of vital types of interleavers including convolutional interleaver onto a single architecture with flexibility to change interleaver size; 2) Hardware complexity for channel interleaving in WiMAX is reduced by using 2-D realization of the interleaver functions; and 3) Silicon cost overheads reduced by avoiding the use of small memories. The proposed architecture consumes 0.18mm2 silicon area for 0.12μm process and can operate at a frequency of 140 MHz. The reduced complexity helps in minimizing the memory utilization, and at the same time provides strong support to on-the-fly computation of permutation patterns.

Keywords: Hardware interleaver implementation, WiMAX, DVB, block interleaver, convolutional interleaver, hardwaremultiplexing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1776
1883 Floating-Point Scaling for BSS Gain Control

Authors: Abdelmalek Fermas, Adel Belouchrani, Otmane Ait Mohamed

Abstract:

In Blind Source Separation (BSS) processing, taking advantage of scaling factor indetermination and based on the floatingpoint representation, we propose a scaling technique applied to the separation matrix, to avoid the saturation or the weakness in the recovered source signals. This technique performs an Automatic Gain Control (AGC) in an on-line BSS environment. We demonstrate the effectiveness of this technique by using the implementation of a division free BSS algorithm with two input, two output. This technique is computationally cheaper and efficient for a hardware implementation.

Keywords: Automatic Gain Control, Blind Source Separation, Floating-Point Representation, FPGA Implementation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1258
1882 FPGA Hardware Implementation and Evaluation of a Micro-Network Architecture for Multi-Core Systems

Authors: Yahia Salah, Med Lassaad Kaddachi, Rached Tourki

Abstract:

This paper presents the design, implementation and evaluation of a micro-network, or Network-on-Chip (NoC), based on a generic pipeline router architecture. The router is designed to efficiently support traffic generated by multimedia applications on embedded multi-core systems. It employs a simplest routing mechanism and implements the round-robin scheduling strategy to resolve output port contentions and minimize latency. A virtual channel flow control is applied to avoid the head-of-line blocking problem and enhance performance in the NoC. The hardware design of the router architecture has been implemented at the register transfer level; its functionality is evaluated in the case of the two dimensional Mesh/Torus topology, and performance results are derived from ModelSim simulator and Xilinx ISE 9.2i synthesis tool. An example of a multi-core image processing system utilizing the NoC structure has been implemented and validated to demonstrate the capability of the proposed micro-network architecture. To reduce complexity of the image compression and decompression architecture, the system use image processing algorithm based on classical discrete cosine transform with an efficient zonal processing approach. The experimental results have confirmed that both the proposed image compression scheme and NoC architecture can achieve a reasonable image quality with lower processing time.

Keywords: Generic Pipeline Network-on-Chip Router Architecture, JPEG Image Compression, FPGA Hardware Implementation, Performance Evaluation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2834