Search results for: FPGA Hardware description

771 Efficient Hardware Realization of Truncated Multipliers using FPGA

Abstract:

Truncated multiplier is a good candidate for digital signal processing (DSP) applications including finite impulse response (FIR) and discrete cosine transform (DCT). Through truncated multiplier a significant reduction in Field Programmable Gate Array (FPGA) resources can be achieved. This paper presents for the first time a comparison of resource utilization of Spartan-3AN and Virtex-5 implementation of standard and truncated multipliers using Very High Speed Integrated Circuit Hardware Description Language (VHDL). The Virtex-5 FPGA shows significant improvement as compared to Spartan-3AN FPGA device. The Virtex-5 FPGA device shows better performance with a percentage ratio of number of occupied slices for standard to truncated multipliers is increased from 40% to 73.86% as compared to Spartan- 3AN is decreased from 68.75% to 58.78%. Results show that the anomaly in Spartan-3AN FPGA device average connection and maximum pin delay have been efficiently reduced in Virtex-5 FPGA device.

Keywords: Digital Signal Processing (DSP), FieldProgrammable Gate Array (FPGA), Spartan-3AN, TruncatedMultiplier, Virtex-5, VHDL.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2519

770 FPGA Based Parallel Architecture for the Computation of Third-Order Cross Moments

Authors: Syed Manzoor Qasim, Shuja Abbasi, Saleh Alshebeili, Bandar Almashary, Ateeq Ahmad Khan

Abstract:

Higher-order Statistics (HOS), also known as cumulants, cross moments and their frequency domain counterparts, known as poly spectra have emerged as a powerful signal processing tool for the synthesis and analysis of signals and systems. Algorithms used for the computation of cross moments are computationally intensive and require high computational speed for real-time applications. For efficiency and high speed, it is often advantageous to realize computation intensive algorithms in hardware. A promising solution that combines high flexibility together with the speed of a traditional hardware is Field Programmable Gate Array (FPGA). In this paper, we present FPGA-based parallel architecture for the computation of third-order cross moments. The proposed design is coded in Very High Speed Integrated Circuit (VHSIC) Hardware Description Language (VHDL) and functionally verified by implementing it on Xilinx Spartan-3 XC3S2000FG900-4 FPGA. Implementation results are presented and it shows that the proposed design can operate at a maximum frequency of 86.618 MHz.

Keywords: Cross moments, Cumulants, FPGA, Hardware Implementation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1683

769 Design of a Neural Networks Classifier for Face Detection

Authors: F. Smach, M. Atri, J. Mitéran, M. Abid

Abstract:

Face detection and recognition has many applications in a variety of fields such as security system, videoconferencing and identification. Face classification is currently implemented in software. A hardware implementation allows real-time processing, but has higher cost and time to-market. The objective of this work is to implement a classifier based on neural networks MLP (Multi-layer Perceptron) for face detection. The MLP is used to classify face and non-face patterns. The systm is described using C language on a P4 (2.4 Ghz) to extract weight values. Then a Hardware implementation is achieved using VHDL based Methodology. We target Xilinx FPGA as the implementation support.

Keywords: Classification, Face Detection, FPGA Hardware description, MLP.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2233

768 Development of A Meta Description Language for Software/Hardware Cooperative Design and Verification for Model-Checking Systems

Authors: Katsumi Wasaki, Naoki Iwasaki

Abstract:

Model-checking tools such as Symbolic Model Verifier (SMV) and NuSMV are available for checking hardware designs. These tools can automatically check the formal legitimacy of a design. However, NuSMV is too low level for describing a complete hardware design. It is therefore necessary to translate the system definition, as designed in a language such as Verilog or VHDL, into a language such as NuSMV for validation. In this paper, we present a meta hardware description language, Melasy, that contains a code generator for existing hardware description languages (HDLs) and languages for model checking that solve this problem.

Keywords: meta description language, software/hardware codesign, co-verification, formal verification, hardware compiler, modelchecking.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1421

767 Fully Parameterizable FPGA based Crypto-Accelerator

Authors: Iqbalur Rahman, Miftahur Rahman, Abul L Haque, Mostafizur Rahman,

Abstract:

In this paper, RSA encryption algorithm and its hardware implementation in Xilinx-s Virtex Field Programmable Gate Arrays (FPGA) is analyzed. The issues of scalability, flexible performance, and silicon efficiency for the hardware acceleration of public key crypto systems are being explored in the present work. Using techniques based on the interleaved math for exponentiation, the proposed RSA calculation architecture is compared to existing FPGA-based solutions for speed, FPGA utilization, and scalability. The paper covers the RSA encryption algorithm, interleaved multiplication, Miller Rabin algorithm for primality test, extended Euclidean math, basic FPGA technology, and the implementation details of the proposed RSA calculation architecture. Performance of several alternative hardware architectures is discussed and compared. Finally, conclusion is drawn, highlighting the advantages of a fully flexible & parameterized design.

Keywords: Crypto Accelerator, FPGA, Public Key Cryptography, RSA.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2709

766 An FPGA Implementation of Intelligent Visual Based Fall Detection

Authors: Peng Shen Ong, Yoong Choon Chang, Chee Pun Ooi, Ettikan K. Karuppiah, Shahirina Mohd Tahir

Abstract:

Falling has been one of the major concerns and threats to the independence of the elderly in their daily lives. With the worldwide significant growth of the aging population, it is essential to have a promising solution of fall detection which is able to operate at high accuracy in real-time and supports large scale implementation using multiple cameras. Field Programmable Gate Array (FPGA) is a highly promising tool to be used as a hardware accelerator in many emerging embedded vision based system. Thus, it is the main objective of this paper to present an FPGA-based solution of visual based fall detection to meet stringent real-time requirements with high accuracy. The hardware architecture of visual based fall detection which utilizes the pixel locality to reduce memory accesses is proposed. By exploiting the parallel and pipeline architecture of FPGA, our hardware implementation of visual based fall detection using FGPA is able to achieve a performance of 60fps for a series of video analytical functions at VGA resolutions (640x480). The results of this work show that FPGA has great potentials and impacts in enabling large scale vision system in the future healthcare industry due to its flexibility and scalability.

Keywords: Fall detection, FPGA, hardware implementation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2419

765 FPGA Implementation of RSA Cryptosystem

Authors: Ridha Ghayoula, ElAmjed Hajlaoui, Talel Korkobi, Mbarek Traii, Hichem Trabelsi

Abstract:

In this paper, the hardware implementation of the RSA public-key cryptographic algorithm is presented. The RSA cryptographic algorithm is depends on the computation of repeated modular exponentials. The Montgomery algorithm is used and modified to reduce hardware resources and to achieve reasonable operating speed for FPGA. An efficient architecture for modular multiplications based on the array multiplier is proposed. We have implemented a RSA cryptosystem based on Montgomery algorithm. As a result, it is shown that proposed architecture contributes to small area and reasonable speed.

Keywords: RSA, Cryptosystem, Montgomery, Implementation.FPGA.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2368

764 Experimental Investigation of Indirect Field Oriented Control of Field Programmable Gate Array Based Five-Phase Induction Motor Drive

Authors: G. Renuka Devi

Abstract:

This paper analyzes the experimental investigation of indirect field oriented control of Field Programmable Gate Array (FPGA) based five-phase induction motor drive. A detailed d-q modeling and Space Vector Pulse Width Modulation (SVPWM) technique of 5-phase drive is elaborated in this paper. In the proposed work, the prototype model of 1 hp 5-phase Voltage Source Inverter (VSI) fed drive is implemented in hardware. SVPWM pulses are generated in FPGA platform through Very High Speed Integrated Circuit Hardware Description Language (VHDL) coding. The experimental results are observed under different loading conditions and compared with simulation results to validate the simulation model.

Keywords: Five-phase induction motor drive, field programmable gate array, indirect field oriented control, multi-phase, space vector pulse width modulation, voltage source inverter, very high speed integrated circuit hardware description language.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1254

763 A Pipelined FSBM Hardware Architecture for HTDV-H.26x

Authors: H. Loukil, A. Ben Atitallah, F. Ghozzi, M. A. Ben Ayed, N. Masmoudi

Abstract:

In MPEG and H.26x standards, to eliminate the temporal redundancy we use motion estimation. Given that the motion estimation stage is very complex in terms of computational effort, a hardware implementation on a re-configurable circuit is crucial for the requirements of different real time multimedia applications. In this paper, we present hardware architecture for motion estimation based on "Full Search Block Matching" (FSBM) algorithm. This architecture presents minimum latency, maximum throughput, full utilization of hardware resources such as embedded memory blocks, and combining both pipelining and parallel processing techniques. Our design is described in VHDL language, verified by simulation and implemented in a Stratix II EP2S130F1020C4 FPGA circuit. The experiment result show that the optimum operating clock frequency of the proposed design is 89MHz which achieves 160M pixels/sec.

Keywords: SAD, FSBM, Hardware Implementation, FPGA.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1599

762 Hardware Implementations for the ISO/IEC 18033-4:2005 Standard for Stream Ciphers

Authors: Paris Kitsos

Abstract:

In this paper the FPGA implementations for four stream ciphers are presented. The two stream ciphers, MUGI and SNOW 2.0 are recently adopted by the International Organization for Standardization ISO/IEC 18033-4:2005 standard. The other two stream ciphers, MICKEY 128 and TRIVIUM have been submitted and are under consideration for the eSTREAM, the ECRYPT (European Network of Excellence for Cryptology) Stream Cipher project. All ciphers were coded using VHDL language. For the hardware implementation, an FPGA device was used. The proposed implementations achieve throughputs range from 166 Mbps for MICKEY 128 to 6080 Mbps for MUGI.

Keywords: Cryptography, ISO/IEC 18033-4:2005 standard, Hardware implementation, Stream ciphers

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1749

761 Efficient Hardware Architecture of the Direct 2- D Transform for the HEVC Standard

Authors: Fatma Belghith, Hassen Loukil, Nouri Masmoudi

Abstract:

This paper presents the hardware design of a unified architecture to compute the 4x4, 8x8 and 16x16 efficient twodimensional (2-D) transform for the HEVC standard. This architecture is based on fast integer transform algorithms. It is designed only with adders and shifts in order to reduce the hardware cost significantly. The goal is to ensure the maximum circuit reuse during the computing while saving 40% for the number of operations. The architecture is developed using FIFOs to compute the second dimension. The proposed hardware was implemented in VHDL. The VHDL RTL code works at 240 MHZ in an Altera Stratix III FPGA. The number of cycles in this architecture varies from 33 in 4-point- 2D-DCT to 172 when the 16-point-2D-DCT is computed. Results show frequency improvements reaching 96% when compared to an architecture described as the direct transcription of the algorithm.

Keywords: HEVC, Modified Integer Transform, FPGA.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2698

760 Design of Local Interconnect Network Controller for Automotive Applications

Authors: Jong-Bae Lee, Seongsoo Lee

Abstract:

Local interconnect network (LIN) is a communication protocol that combines sensors, actuators, and processors to a functional module in automotive applications. In this paper, a LIN ver. 2.2A controller was designed in Verilog hardware description language (Verilog HDL) and implemented in field-programmable gate array (FPGA). Its operation was verified by making full-scale LIN network with the presented FPGA-implemented LIN controller, commercial LIN transceivers, and commercial processors. When described in Verilog HDL and synthesized in 0.18 μm technology, its gate size was about 2,300 gates.

Keywords: Local interconnect network, controller, transceiver, processor.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1520

759 Run-Time Customisation of Soft-Core CPUs on Field Programmable Gate Array

Authors: Rehab Abdullah Shendi

Abstract:

The use of customised soft-core processors in which instructions can be integrated into a system in application hardware is increasing in the Field Programmable Gate Array (FPGA) field. Specifically, the partial run-time reconfiguration of FPGAs in specialised processors for a particular domain can be very beneficial. In this report, the design and implementation for the customisation of a soft-core MIPS processor using an FPGA and partial reconfiguration (PR) of FPGA technology will be addressed to achieve efficient resource use. This can be achieved using a PR design flow that helps the design fit into a smaller device. Moreover, the impact of static power consumption could be reduced due to runtime reconfiguration. This will be done by configurable custom instructions implemented in the hardware as an extension on the MIPS CPU. The aim of this project is to investigate the PR of FPGAs for run-time adaptations of the instruction set of a soft-core CPU, including the integration of custom instructions and the exploration of the potential to use the MultiBoot feature available in Xilinx FPGAs to carry out the PR process. The system will be evaluated and tested on a Nexus 3 development board featuring a Xilinx Spartran-6 FPGA. The system will be able to load reconfigurable custom instructions dynamically into user programs with the help of the trap handler when the custom instruction is called by the MIPS CPU. The results of this experiment demonstrate that custom instructions in hardware can speed up a certain function and many instructions can be saved when compared to a software implementation of the same function. Implementing custom instructions in hardware is perfectly possible and worth exploring.

Keywords: Customisation, FPGA, MIPS, partial reconfiguration.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1143

758 Digital Filter for Cochlear Implant Implemented on a Field- Programmable Gate Array

Authors: Rekha V. Dundur , M.V.Latte, S.Y. Kulkarni, M.K.Venkatesha

Abstract:

The advent of multi-million gate Field Programmable Gate Arrays (FPGAs) with hardware support for multiplication opens an opportunity to recreate a significant portion of the front end of a human cochlea using this technology. In this paper we describe the implementation of the cochlear filter and show that it is entirely suited to a single device XC3S500 FPGA implementation .The filter gave a good fit to real time data with efficiency of hardware usage.

Keywords: Cochlea, FPGA, IIR (Infinite Impulse Response), Multiplier.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2269

757 Digital Power Management Hardware Realization Using FPGA

Authors: Kar Foo Chong, Andreas Lee Astuti, Pradeep K. Gopalakrishnan, T. Hui Teo

Abstract:

This paper describes design of a digital feedback loop for a low switching frequency dc-dc switching converters. Low switching frequencies were selected in this design. A look up table for the digital PID (proportional integrator differentiator) compensator was implemented using Altera Stratix II with built-in ADC (analog-to-digital converter) to achieve this hardware realization. Design guidelines are given for the PID compensator, high frequency DPWM (digital pulse width modulator) and moving average filter.

Keywords: dc-dc converter, FPGA, PID, power management, .

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1953

756 FPGA-based Systems for Evolvable Hardware

Authors: Cyrille Lambert, Tatiana Kalganova, Emanuele Stomeo

Abstract:

Since 1992, year where Hugo de Garis has published the first paper on Evolvable Hardware (EHW), a period of intense creativity has followed. It has been actively researched, developed and applied to various problems. Different approaches have been proposed that created three main classifications: extrinsic, mixtrinsic and intrinsic EHW. Each of these solutions has a real interest. Nevertheless, although the extrinsic evolution generates some excellent results, the intrinsic systems are not so advanced. This paper suggests 3 possible solutions to implement the run-time configuration intrinsic EHW system: FPGA-based Run-Time Configuration system, JBits-based Run-Time Configuration system and Multi-board functional-level Run-Time Configuration system. The main characteristic of the proposed architectures is that they are implemented on Field Programmable Gate Array. A comparison of proposed solutions demonstrates that multi-board functional-level run-time configuration is superior in terms of scalability, flexibility and the implementation easiness.

Keywords: Evolvable hardware, evolutionary computation, FPGA systems.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2404

755 FPGA Implementation of RSA Encryption Algorithm for E-Passport Application

Authors: Khaled Shehata, Hanady Hussien, Sara Yehia

Abstract:

Securing the data stored on E-passport is a very important issue. RSA encryption algorithm is suitable for such application with low data size. In this paper the design and implementation of 1024 bit-key RSA encryption and decryption module on an FPGA is presented. The module is verified through comparing the result with that obtained from MATLAB tools. The design runs at a frequency of 36.3 MHz on Virtex-5 Xilinx FPGA. The key size is designed to be 1024-bit to achieve high security for the passport information. The whole design is achieved through VHDL design entry which makes it a portable design and can be directed to any hardware platform.

Keywords: RSA, VHDL, FPGA, modular multiplication, modular exponential.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 5374

754 Hardware Prototyping of an Efficient Encryption Engine

Authors: Muhammad I. Ibrahimy, Mamun B.I. Reaz, Khandaker Asaduzzaman, Sazzad Hussain

Abstract:

An approach to develop the FPGA of a flexible key RSA encryption engine that can be used as a standard device in the secured communication system is presented. The VHDL modeling of this RSA encryption engine has the unique characteristics of supporting multiple key sizes, thus can easily be fit into the systems that require different levels of security. A simple nested loop addition and subtraction have been used in order to implement the RSA operation. This has made the processing time faster and used comparatively smaller amount of space in the FPGA. The hardware design is targeted on Altera STRATIX II device and determined that the flexible key RSA encryption engine can be best suited in the device named EP2S30F484C3. The RSA encryption implementation has made use of 13,779 units of logic elements and achieved a clock frequency of 17.77MHz. It has been verified that this RSA encryption engine can perform 32-bit, 256-bit and 1024-bit encryption operation in less than 41.585us, 531.515us and 790.61us respectively.

Keywords: RSA, FPGA, Communication, Security, VHDL.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1403

753 Spacecraft Neural Network Control System Design using FPGA

Authors: Hanaa T. El-Madany, Faten H. Fahmy, Ninet M. A. El-Rahman, Hassen T. Dorrah

Abstract:

Designing and implementing intelligent systems has become a crucial factor for the innovation and development of better products of space technologies. A neural network is a parallel system, capable of resolving paradigms that linear computing cannot. Field programmable gate array (FPGA) is a digital device that owns reprogrammable properties and robust flexibility. For the neural network based instrument prototype in real time application, conventional specific VLSI neural chip design suffers the limitation in time and cost. With low precision artificial neural network design, FPGAs have higher speed and smaller size for real time application than the VLSI and DSP chips. So, many researchers have made great efforts on the realization of neural network (NN) using FPGA technique. In this paper, an introduction of ANN and FPGA technique are briefly shown. Also, Hardware Description Language (VHDL) code has been proposed to implement ANNs as well as to present simulation results with floating point arithmetic. Synthesis results for ANN controller are developed using Precision RTL. Proposed VHDL implementation creates a flexible, fast method and high degree of parallelism for implementing ANN. The implementation of multi-layer NN using lookup table LUT reduces the resource utilization for implementation and time for execution.

Keywords: Spacecraft, neural network, FPGA, VHDL.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2964

752 Neural Network Implementation Using FPGA: Issues and Application

Authors: A. Muthuramalingam, S. Himavathi, E. Srinivasan

Abstract:

.Hardware realization of a Neural Network (NN), to a large extent depends on the efficient implementation of a single neuron. FPGA-based reconfigurable computing architectures are suitable for hardware implementation of neural networks. FPGA realization of ANNs with a large number of neurons is still a challenging task. This paper discusses the issues involved in implementation of a multi-input neuron with linear/nonlinear excitation functions using FPGA. Implementation method with resource/speed tradeoff is proposed to handle signed decimal numbers. The VHDL coding developed is tested using Xilinx XC V50hq240 Chip. To improve the speed of operation a lookup table method is used. The problems involved in using a lookup table (LUT) for a nonlinear function is discussed. The percentage saving in resource and the improvement in speed with an LUT for a neuron is reported. An attempt is also made to derive a generalized formula for a multi-input neuron that facilitates to estimate approximately the total resource requirement and speed achievable for a given multilayer neural network. This facilitates the designer to choose the FPGA capacity for a given application. Using the proposed method of implementation a neural network based application, namely, a Space vector modulator for a vector-controlled drive is presented

Keywords: FPGA implementation, multi-input neuron, neural network, nn based space vector modulator.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4356

751 FPGA Implement of a Vision Based Lane Departure Warning System

Authors: Yu Ren Lin, Yi Feng Su

Abstract:

Using vision based solution in intelligent vehicle application often needs large memory to handle video stream and image process which increase complexity of hardware and software. In this paper, we present a FPGA implement of a vision based lane departure warning system. By taking frame of videos, the line gradient of line is estimated and the lane marks are found. By analysis the position of lane mark, departure of vehicle will be detected in time. This idea has been implemented in Xilinx Spartan6 FPGA. The lane departure warning system used 39% logic resources and no memory of the device. The average availability is 92.5%. The frame rate is more than 30 frames per second (fps).

Keywords: Lane departure warning system, image, FPGA.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2026

750 CPU Architecture Based on Static Hardware Scheduler Engine and Multiple Pipeline Registers

Authors: Ionel Zagan, Vasile Gheorghita Gaitan

Abstract:

The development of CPUs and of real-time systems based on them made it possible to use time at increasingly low resolutions. Together with the scheduling methods and algorithms, time organizing has been improved so as to respond positively to the need for optimization and to the way in which the CPU is used. This presentation contains both a detailed theoretical description and the results obtained from research on improving the performances of the nMPRA (Multi Pipeline Register Architecture) processor by implementing specific functions in hardware. The proposed CPU architecture has been developed, simulated and validated by using the FPGA Virtex-7 circuit, via a SoC project. Although the nMPRA processor hardware structure with five pipeline stages is very complex, the present paper presents and analyzes the tests dedicated to the implementation of the CPU and of the memory on-chip for instructions and data. In order to practically implement and test the entire SoC project, various tests have been performed. These tests have been performed in order to verify the drivers for peripherals and the boot module named Bootloader.

Keywords: Hardware scheduler, nMPRA processor, real-time systems, scheduling methods.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1040

749 Design and Implementation of TMS320C31 DSP and FPGA for Conventional Direct Torque Control (DTC) of Induction Machines

Authors: C. L. Toh, N. R. N. Idris, A. H. M. Yatim

Abstract:

This paper introduces a new digital logic design, which combines the DSP and FPGA to implement the conventional DTC of induction machine. The DSP will be used for floating point calculation whereas the FPGA main task is to implement the hysteresis-based controller. The emphasis is on FPGA digital logic design. The simulation and experimental results are presented and summarized.

Keywords: DTC, DSP, FPGA, induction machine

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1925

748 A Survey of Field Programmable Gate Array-Based Convolutional Neural Network Accelerators

Authors: Wei Zhang

Abstract:

With the rapid development of deep learning, neural network and deep learning algorithms play a significant role in various practical applications. Due to the high accuracy and good performance, Convolutional Neural Networks (CNNs) especially have become a research hot spot in the past few years. However, the size of the networks becomes increasingly large scale due to the demands of the practical applications, which poses a significant challenge to construct a high-performance implementation of deep learning neural networks. Meanwhile, many of these application scenarios also have strict requirements on the performance and low-power consumption of hardware devices. Therefore, it is particularly critical to choose a moderate computing platform for hardware acceleration of CNNs. This article aimed to survey the recent advance in Field Programmable Gate Array (FPGA)-based acceleration of CNNs. Various designs and implementations of the accelerator based on FPGA under different devices and network models are overviewed, and the versions of Graphic Processing Units (GPUs), Application Specific Integrated Circuits (ASICs) and Digital Signal Processors (DSPs) are compared to present our own critical analysis and comments. Finally, we give a discussion on different perspectives of these acceleration and optimization methods on FPGA platforms to further explore the opportunities and challenges for future research. More helpfully, we give a prospect for future development of the FPGA-based accelerator.

Keywords: Deep learning, field programmable gate array, FPGA, hardware acceleration, convolutional neural networks, CNN.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 814

747 FPGA Implementation of a Vision-Based Blind Spot Warning System

Authors: Yu Ren Lin, Yu Hong Li

Abstract:

Vision-based intelligent vehicle applications often require large amounts of memory to handle video streaming and image processing, which in turn increases complexity of hardware and software. This paper presents an FPGA implement of a vision-based blind spot warning system. Using video frames, the information of the blind spot area turns into one-dimensional information. Analysis of the estimated entropy of image allows the detection of an object in time. This idea has been implemented in the XtremeDSP video starter kit. The blind spot warning system uses only 13% of its logic resources and 95k bits block memory, and its frame rate is over 30 frames per sec (fps).

Keywords: blind-spot area, image, FPGA

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1788

746 FPGA Implementation of the “PYRAMIDS“ Block Cipher

Authors: A. AlKalbany, H. Al hassan, M. Saeb

Abstract:

The “PYRAMIDS" Block Cipher is a symmetric encryption algorithm of a 64, 128, 256-bit length, that accepts a variable key length of 128, 192, 256 bits. The algorithm is an iterated cipher consisting of repeated applications of a simple round transformation with different operations and different sequence in each round. The algorithm was previously software implemented in Cµ code. In this paper, a hardware implementation of the algorithm, using Field Programmable Gate Arrays (FPGA), is presented. In this work, we discuss the algorithm, the implemented micro-architecture, and the simulation and implementation results. Moreover, we present a detailed comparison with other implemented standard algorithms. In addition, we include the floor plan as well as the circuit diagrams of the various micro-architecture modules.

Keywords: FPGA, VHDL, micro-architecture, encryption, cryptography, algorithm, data communication security.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1654

745 Analysis of Lightweight Register Hardware Threat

Authors: Yang Luo, Beibei Wang

Abstract:

In this paper, we present a design methodology of lightweight register transfer level (RTL) hardware threat implemented based on a MAX II FPGA platform. The dynamic power consumed by the toggling of the various bit of registers as well as the dynamic power consumed per unit of logic circuits were analyzed. The hardware threat was designed taking advantage of the differences in dynamic power consumed per unit of logic circuits to hide the transfer information. The experiment result shows that the register hardware threat was successfully implemented by using different dynamic power consumed per unit of logic circuits to hide the key information of DES encryption module. It needs more than 100000 sample curves to reduce the background noise by comparing the sample space when it completely meets the time alignment requirement. In additional, an external trigger signal is playing a very important role to detect the hardware threat in this experiment.

Keywords: Side-channel analysis, hardware threat, register transfer level, dynamic power.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 946

744 High Level Synthesis of Digital Filters Based On Sub-Token Forwarding

Authors: Iyad F. Jafar, Sandra J. Alrawashdeh, Ban K. Alhamayel

Abstract:

High level synthesis (HLS) is a process which generates register-transfer level design for digital systems from behavioral description. There are many HLS algorithms and commercial tools. However, most of these algorithms consider a behavioral description for the system when a single token is presented to the system. This approach does not exploit extra hardware efficiently, especially in the design of digital filters where common operations may exist between successive tokens. In this paper, we modify the behavioral description to process multiple tokens in parallel. However, this approach is unlike the full processing that requires full hardware replication. It exploits the presence of common operations between successive tokens. The performance of the proposed approach is better than sequential processing and approaches that of full parallel processing as the hardware resources are increased.

Keywords: Digital filters, High level synthesis, Sub-token forwarding

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1412

743 Massively-Parallel Bit-Serial Neural Networks for Fast Epilepsy Diagnosis: A Feasibility Study

Authors: Si Mon Kueh, Tom J. Kazmierski

Abstract:

There are about 1% of the world population suffering from the hidden disability known as epilepsy and major developing countries are not fully equipped to counter this problem. In order to reduce the inconvenience and danger of epilepsy, different methods have been researched by using a artificial neural network (ANN) classification to distinguish epileptic waveforms from normal brain waveforms. This paper outlines the aim of achieving massive ANN parallelization through a dedicated hardware using bit-serial processing. The design of this bit-serial Neural Processing Element (NPE) is presented which implements the functionality of a complete neuron using variable accuracy. The proposed design has been tested taking into consideration non-idealities of a hardware ANN. The NPE consists of a bit-serial multiplier which uses only 16 logic elements on an Altera Cyclone IV FPGA and a bit-serial ALU as well as a look-up table. Arrays of NPEs can be driven by a single controller which executes the neural processing algorithm. In conclusion, the proposed compact NPE design allows the construction of complex hardware ANNs that can be implemented in a portable equipment that suits the needs of a single epileptic patient in his or her daily activities to predict the occurrences of impending tonic conic seizures.

Keywords: Artificial Neural Networks, bit-serial neural processor, FPGA, Neural Processing Element.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1526

742 Seamless MATLAB® to Register-Transfer Level Design Methodology Using High-Level Synthesis

Authors: Petri Solanti, Russell Klein

Abstract:

Many designers are asking for an automated path from an abstract mathematical MATLAB model to a high-quality Register-Transfer Level (RTL) hardware description. Manual transformations of MATLAB or intermediate code are needed, when the design abstraction is changed. Design conversion is problematic as it is multidimensional and it requires many different design steps to translate the mathematical representation of the desired functionality to an efficient hardware description with the same behavior and configurability. Yet, a manual model conversion is not an insurmountable task. Using currently available design tools and an appropriate design methodology, converting a MATLAB model to efficient hardware is a reasonable effort. This paper describes a simple and flexible design methodology that was developed together with several design teams.

Keywords: Design methodology, high-level synthesis, MATLAB, verification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 490