Search results for: Hardware accelerators
420 High Level Synthesis of Canny Edge Detection Algorithm on Zynq Platform
Authors: Hanaa M. Abdelgawad, Mona Safar, Ayman M. Wahba
Abstract:
Real time image and video processing is a demand in many computer vision applications, e.g. video surveillance, traffic management and medical imaging. The processing of those video applications requires high computational power. Thus, the optimal solution is the collaboration of CPU and hardware accelerators. In this paper, a Canny edge detection hardware accelerator is proposed. Edge detection is one of the basic building blocks of video and image processing applications. It is a common block in the pre-processing phase of image and video processing pipeline. Our presented approach targets offloading the Canny edge detection algorithm from processing system (PS) to programmable logic (PL) taking the advantage of High Level Synthesis (HLS) tool flow to accelerate the implementation on Zynq platform. The resulting implementation enables up to a 100x performance improvement through hardware acceleration. The CPU utilization drops down and the frame rate jumps to 60 fps of 1080p full HD input video stream.
Keywords: High Level Synthesis, Canny edge detection, Hardware accelerators, and Computer Vision.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 5430419 Development of A Meta Description Language for Software/Hardware Cooperative Design and Verification for Model-Checking Systems
Authors: Katsumi Wasaki, Naoki Iwasaki
Abstract:
Model-checking tools such as Symbolic Model Verifier (SMV) and NuSMV are available for checking hardware designs. These tools can automatically check the formal legitimacy of a design. However, NuSMV is too low level for describing a complete hardware design. It is therefore necessary to translate the system definition, as designed in a language such as Verilog or VHDL, into a language such as NuSMV for validation. In this paper, we present a meta hardware description language, Melasy, that contains a code generator for existing hardware description languages (HDLs) and languages for model checking that solve this problem.Keywords: meta description language, software/hardware codesign, co-verification, formal verification, hardware compiler, modelchecking.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1463418 Hardware Approach to Solving Password Exposure Problem through Keyboard Sniff
Authors: Kyungroul Lee, Kwangjin Bae, Kangbin Yim
Abstract:
This paper introduces a hardware solution to password exposure problem caused by direct accesses to the keyboard hardware interfaces through which a possible attacker is able to grab user-s password even where existing countermeasures are deployed. Several researches have proposed reasonable software based solutions to the problem for years. However, recently introduced hardware vulnerability problems have neutralized the software approaches and yet proposed any effective software solution to the vulnerability. Hardware approach in this paper is expected as the only solution to the vulnerabilityKeywords: Keyboard sniff, password exposure, hardware vulnerability, privacy problem, insider security.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1577417 A Survey of Field Programmable Gate Array-Based Convolutional Neural Network Accelerators
Authors: Wei Zhang
Abstract:
With the rapid development of deep learning, neural network and deep learning algorithms play a significant role in various practical applications. Due to the high accuracy and good performance, Convolutional Neural Networks (CNNs) especially have become a research hot spot in the past few years. However, the size of the networks becomes increasingly large scale due to the demands of the practical applications, which poses a significant challenge to construct a high-performance implementation of deep learning neural networks. Meanwhile, many of these application scenarios also have strict requirements on the performance and low-power consumption of hardware devices. Therefore, it is particularly critical to choose a moderate computing platform for hardware acceleration of CNNs. This article aimed to survey the recent advance in Field Programmable Gate Array (FPGA)-based acceleration of CNNs. Various designs and implementations of the accelerator based on FPGA under different devices and network models are overviewed, and the versions of Graphic Processing Units (GPUs), Application Specific Integrated Circuits (ASICs) and Digital Signal Processors (DSPs) are compared to present our own critical analysis and comments. Finally, we give a discussion on different perspectives of these acceleration and optimization methods on FPGA platforms to further explore the opportunities and challenges for future research. More helpfully, we give a prospect for future development of the FPGA-based accelerator.Keywords: Deep learning, field programmable gate array, FPGA, hardware acceleration, convolutional neural networks, CNN.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 892416 An Efficient Architecture for Interleaved Modular Multiplication
Authors: Ahmad M. Abdel Fattah, Ayman M. Bahaa El-Din, Hossam M.A. Fahmy
Abstract:
Modular multiplication is the basic operation in most public key cryptosystems, such as RSA, DSA, ECC, and DH key exchange. Unfortunately, very large operands (in order of 1024 or 2048 bits) must be used to provide sufficient security strength. The use of such big numbers dramatically slows down the whole cipher system, especially when running on embedded processors. So far, customized hardware accelerators - developed on FPGAs or ASICs - were the best choice for accelerating modular multiplication in embedded environments. On the other hand, many algorithms have been developed to speed up such operations. Examples are the Montgomery modular multiplication and the interleaved modular multiplication algorithms. Combining both customized hardware with an efficient algorithm is expected to provide a much faster cipher system. This paper introduces an enhanced architecture for computing the modular multiplication of two large numbers X and Y modulo a given modulus M. The proposed design is compared with three previous architectures depending on carry save adders and look up tables. Look up tables should be loaded with a set of pre-computed values. Our proposed architecture uses the same carry save addition, but replaces both look up tables and pre-computations with an enhanced version of sign detection techniques. The proposed architecture supports higher frequencies than other architectures. It also has a better overall absolute time for a single operation.Keywords: Montgomery multiplication, modular multiplication, efficient architecture, FPGA, RSA
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2451415 A Pipelined FSBM Hardware Architecture for HTDV-H.26x
Authors: H. Loukil, A. Ben Atitallah, F. Ghozzi, M. A. Ben Ayed, N. Masmoudi
Abstract:
In MPEG and H.26x standards, to eliminate the temporal redundancy we use motion estimation. Given that the motion estimation stage is very complex in terms of computational effort, a hardware implementation on a re-configurable circuit is crucial for the requirements of different real time multimedia applications. In this paper, we present hardware architecture for motion estimation based on "Full Search Block Matching" (FSBM) algorithm. This architecture presents minimum latency, maximum throughput, full utilization of hardware resources such as embedded memory blocks, and combining both pipelining and parallel processing techniques. Our design is described in VHDL language, verified by simulation and implemented in a Stratix II EP2S130F1020C4 FPGA circuit. The experiment result show that the optimum operating clock frequency of the proposed design is 89MHz which achieves 160M pixels/sec.Keywords: SAD, FSBM, Hardware Implementation, FPGA.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1639414 Analysis of Lightweight Register Hardware Threat
Authors: Yang Luo, Beibei Wang
Abstract:
In this paper, we present a design methodology of lightweight register transfer level (RTL) hardware threat implemented based on a MAX II FPGA platform. The dynamic power consumed by the toggling of the various bit of registers as well as the dynamic power consumed per unit of logic circuits were analyzed. The hardware threat was designed taking advantage of the differences in dynamic power consumed per unit of logic circuits to hide the transfer information. The experiment result shows that the register hardware threat was successfully implemented by using different dynamic power consumed per unit of logic circuits to hide the key information of DES encryption module. It needs more than 100000 sample curves to reduce the background noise by comparing the sample space when it completely meets the time alignment requirement. In additional, an external trigger signal is playing a very important role to detect the hardware threat in this experiment.
Keywords: Side-channel analysis, hardware threat, register transfer level, dynamic power.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 991413 Analysis of Genotype Size for an Evolvable Hardware System
Authors: Emanuele Stomeo, Tatiana Kalganova, Cyrille Lambert
Abstract:
The evolution of logic circuits, which falls under the heading of evolvable hardware, is carried out by evolutionary algorithms. These algorithms are able to automatically configure reconfigurable devices. One of main difficulties in developing evolvable hardware with the ability to design functional electrical circuits is to choose the most favourable EA features such as fitness function, chromosome representations, population size, genetic operators and individual selection. Until now several researchers from the evolvable hardware community have used and tuned these parameters and various rules on how to select the value of a particular parameter have been proposed. However, to date, no one has presented a study regarding the size of the chromosome representation (circuit layout) to be used as a platform for the evolution in order to increase the evolvability, reduce the number of generations and optimize the digital logic circuits through reducing the number of logic gates. In this paper this topic has been thoroughly investigated and the optimal parameters for these EA features have been proposed. The evolution of logic circuits has been carried out by an extrinsic evolvable hardware system which uses (1+λ) evolution strategy as the core of the evolution.
Keywords: Evolvable hardware, genotype size, computational intelligence, design of logic circuits.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1660412 A Low-Area Fully-Reconfigurable Hardware Design of Fast Fourier Transform System for 3GPP-LTE Standard
Authors: Xin-Yu Shih, Yue-Qu Liu, Hong-Ru Chou
Abstract:
This paper presents a low-area and fully-reconfigurable Fast Fourier Transform (FFT) hardware design for 3GPP-LTE communication standard. It can fully support 32 different FFT sizes, up to 2048 FFT points. Besides, a special processing element is developed for making reconfigurable computing characteristics possible, while first-in first-out (FIFO) scheduling scheme design technique is proposed for hardware-friendly FIFO resource arranging. In a synthesis chip realization via TSMC 40 nm CMOS technology, the hardware circuit only occupies core area of 0.2325 mm2 and dissipates 233.5 mW at maximal operating frequency of 250 MHz.
Keywords: Reconfigurable, fast Fourier transform, single-path delay feedback, 3GPP-LTE.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1000411 An Efficient Hardware Implementation of Extended and Fast Physical Addressing in Microprocessor-Based Systems Using Programmable Logic
Authors: Mountassar Maamoun, Abdelhamid Meraghni, Abdelhalim Benbelkacem, Daoud Berkani
Abstract:
This paper describes an efficient hardware implementation of a new technique for interfacing the data exchange between the microprocessor-based systems and the external devices. This technique, based on the use of software/hardware system and a reduced physical address, enlarges the interfacing capacity of the microprocessor-based systems, uses the Direct Memory Access (DMA) to increases the frequency of the new bus, and improves the speed of data exchange. While using this architecture in microprocessor-based system or in computer, the input of the hardware part of our system will be connected to the bus system, and the output, which is a new bus, will be connected to an external device. The new bus is composed of a data bus, a control bus and an address bus. A Xilinx Integrated Software Environment (ISE) 7.1i has been used for the programmable logic implementation.
Keywords: Interfacing, Software/hardware System, CPLD, programmable logic, DMA.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1384410 Multi-board Run-time Reconfigurable Implementation of Intrinsic Evolvable Hardware
Authors: Cyrille Lambert, Tatiana Kalganova, Emanuele Stomeo, Manissa Wilson
Abstract:
A multi-board run-time reconfigurable (MRTR) system for evolvable hardware (EHW) is introduced with the aim to implement on hardware the bidirectional incremental evolution (BIE) method. The main features of this digital intrinsic EHW solution rely on the multi-board approach, the variable chromosome length management and the partial configuration of the reconfigurable circuit. These three features provide a high scalability to the solution. The design has been written in VHDL with the concern of not being platform dependant in order to keep a flexibility factor as high as possible. This solution helps tackling the problem of evolving complex task on digital configurable support.Keywords: Evolvable Hardware, Evolutionary Strategy, multiboardFPGA system.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1578409 Hardware Error Analysis and Severity Characterization in Linux-Based Server Systems
Authors: N. Georgoulopoulos, A. Hatzopoulos, K. Karamitsios, K. Kotrotsios, A. I. Metsai
Abstract:
Current server systems are responsible for critical applications that run in different infrastructures, such as the cloud, physical machines, and virtual machines. A common challenge that these systems face are the various hardware faults that may occur due to the high load, among other reasons, which translates to errors resulting in malfunctions or even server downtime. The most important hardware parts, that are causing most of the errors, are the CPU, RAM, and the hard drive - HDD. In this work, we investigate selected CPU, RAM, and HDD errors, observed or simulated in kernel ring buffer log files from GNU/Linux servers. Moreover, a severity characterization is given for each error type. Understanding these errors is crucial for the efficient analysis of kernel logs that are usually utilized for monitoring servers and diagnosing faults. In addition, to support the previous analysis, we present possible ways of simulating hardware errors in RAM and HDD, aiming to facilitate the testing of methods for detecting and tackling the above issues in a server running on GNU/Linux.
Keywords: hardware errors, Kernel logs, GNU/Linux servers, RAM, HDD, CPU
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 680408 The Hardware Implementation of a Novel Genetic Algorithm
Authors: Zhenhuan Zhu, David Mulvaney, Vassilios Chouliaras
Abstract:
This paper presents a novel genetic algorithm, termed the Optimum Individual Monogenetic Algorithm (OIMGA) and describes its hardware implementation. As the monogenetic strategy retains only the optimum individual, the memory requirement is dramatically reduced and no crossover circuitry is needed, thereby ensuring the requisite silicon area is kept to a minimum. Consequently, depending on application requirements, OIMGA allows the investigation of solutions that warrant either larger GA populations or individuals of greater length. The results given in this paper demonstrate that both the performance of OIMGA and its convergence time are superior to those of existing hardware GA implementations. Local convergence is achieved in OIMGA by retaining elite individuals, while population diversity is ensured by continually searching for the best individuals in fresh regions of the search space.Keywords: Genetic algorithms, hardware-based machinelearning.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1639407 Individual Actuators of a Car-Like Robot with Back Trailer
Authors: Tarek M. Nazih El-Derini, Ahmed K. El-Shenawy
Abstract:
This paper presents the hardware implemented and validation for a special system to assist the unprofessional users of car with back trailers. The system consists of two platforms; the front car platform (C) and the trailer platform (T). The main objective is to control the Trailer platform using the actuators found in the front platform (c). The mobility of the platform (C) is investigated and inverse and forward kinematics model is obtained for both platforms (C) and (T).The system is simulated using Matlab M-file and the simulation examples results illustrated the system performance. The system is constructed with a hardware setup for the front and trailer platform. The hardware experimental results and the simulated examples outputs showed the validation of the hardware setup.
Keywords: Kinematics, Modeling, Wheeled Mobile Robot.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2308406 Generational PipeLined Genetic Algorithm (PLGA)using Stochastic Selection
Authors: Malay K. Pakhira, Rajat K. De
Abstract:
In this paper, a pipelined version of genetic algorithm, called PLGA, and a corresponding hardware platform are described. The basic operations of conventional GA (CGA) are made pipelined using an appropriate selection scheme. The selection operator, used here, is stochastic in nature and is called SA-selection. This helps maintaining the basic generational nature of the proposed pipelined GA (PLGA). A number of benchmark problems are used to compare the performances of conventional roulette-wheel selection and the SA-selection. These include unimodal and multimodal functions with dimensionality varying from very small to very large. It is seen that the SA-selection scheme is giving comparable performances with respect to the classical roulette-wheel selection scheme, for all the instances, when quality of solutions and rate of convergence are considered. The speedups obtained by PLGA for different benchmarks are found to be significant. It is shown that a complete hardware pipeline can be developed using the proposed scheme, if parallel evaluation of the fitness expression is possible. In this connection a low-cost but very fast hardware evaluation unit is described. Results of simulation experiments show that in a pipelined hardware environment, PLGA will be much faster than CGA. In terms of efficiency, PLGA is found to outperform parallel GA (PGA) also.Keywords: Hardware evaluation, Hardware pipeline, Optimization, Pipelined genetic algorithm, SA-selection.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1442405 Efficient Hardware Architecture of the Direct 2- D Transform for the HEVC Standard
Authors: Fatma Belghith, Hassen Loukil, Nouri Masmoudi
Abstract:
This paper presents the hardware design of a unified architecture to compute the 4x4, 8x8 and 16x16 efficient twodimensional (2-D) transform for the HEVC standard. This architecture is based on fast integer transform algorithms. It is designed only with adders and shifts in order to reduce the hardware cost significantly. The goal is to ensure the maximum circuit reuse during the computing while saving 40% for the number of operations. The architecture is developed using FIFOs to compute the second dimension. The proposed hardware was implemented in VHDL. The VHDL RTL code works at 240 MHZ in an Altera Stratix III FPGA. The number of cycles in this architecture varies from 33 in 4-point- 2D-DCT to 172 when the 16-point-2D-DCT is computed. Results show frequency improvements reaching 96% when compared to an architecture described as the direct transcription of the algorithm.Keywords: HEVC, Modified Integer Transform, FPGA.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2747404 A Novel Genetic Algorithm Designed for Hardware Implementation
Authors: Zhenhuan Zhu, David Mulvaney, Vassilios Chouliaras
Abstract:
A new genetic algorithm, termed the 'optimum individual monogenetic genetic algorithm' (OIMGA), is presented whose properties have been deliberately designed to be well suited to hardware implementation. Specific design criteria were to ensure fast access to the individuals in the population, to keep the required silicon area for hardware implementation to a minimum and to incorporate flexibility in the structure for the targeting of a range of applications. The first two criteria are met by retaining only the current optimum individual, thereby guaranteeing a small memory requirement that can easily be stored in fast on-chip memory. Also, OIMGA can be easily reconfigured to allow the investigation of problems that normally warrant either large GA populations or individuals many genes in length. Local convergence is achieved in OIMGA by retaining elite individuals, while population diversity is ensured by continually searching for the best individuals in fresh regions of the search space. The results given in this paper demonstrate that both the performance of OIMGA and its convergence time are superior to those of a range of existing hardware GA implementations.
Keywords: Genetic algorithms, genetic hardware, machinelearning.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2023403 Hardware-in-the-Loop Test for Automatic Voltage Regulator of Synchronous Condenser
Authors: Ha Thi Nguyen, Guangya Yang, Arne Hejde Nielsen, Peter Højgaard Jensen
Abstract:
Automatic voltage regulator (AVR) plays an important role in volt/var control of synchronous condenser (SC) in power systems. Test AVR performance in steady-state and dynamic conditions in real grid is expensive, low efficiency, and hard to achieve. To address this issue, we implement hardware-in-the-loop (HiL) test for the AVR of SC to test the steady-state and dynamic performances of AVR in different operating conditions. Startup procedure of the system and voltage set point changes are studied to evaluate the AVR hardware response. Overexcitation, underexcitation, and AVR set point loss are tested to compare the performance of SC with the AVR hardware and that of simulation. The comparative results demonstrate how AVR will work in a real system. The results show HiL test is an effective approach for testing devices before deployment and is able to parameterize the controller with lower cost, higher efficiency, and more flexibility.Keywords: Automatic voltage regulator, hardware-in-the-loop, synchronous condenser, real time digital simulator.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1098402 Hardware Implementation of Local Binary Pattern Based Two-Bit Transform Motion Estimation
Authors: Seda Yavuz, Anıl Çelebi, Aysun Taşyapı Çelebi, Oğuzhan Urhan
Abstract:
Nowadays, demand for using real-time video transmission capable devices is ever-increasing. So, high resolution videos have made efficient video compression techniques an essential component for capturing and transmitting video data. Motion estimation has a critical role in encoding raw video. Hence, various motion estimation methods are introduced to efficiently compress the video. Low bit‑depth representation based motion estimation methods facilitate computation of matching criteria and thus, provide small hardware footprint. In this paper, a hardware implementation of a two-bit transformation based low-complexity motion estimation method using local binary pattern approach is proposed. Image frames are represented in two-bit depth instead of full-depth by making use of the local binary pattern as a binarization approach and the binarization part of the hardware architecture is explained in detail. Experimental results demonstrate the difference between the proposed hardware architecture and the architectures of well-known low-complexity motion estimation methods in terms of important aspects such as resource utilization, energy and power consumption.
Keywords: Binarization, hardware architecture, local binary pattern, motion estimation, two-bit transform.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1373401 Adaptive Multiple Transforms Hardware Architecture for Versatile Video Coding
Authors: T. Damak, S. Houidi, M. A. Ben Ayed, N. Masmoudi
Abstract:
The Versatile Video Coding standard (VVC) is actually under development by the Joint Video Exploration Team (or JVET). An Adaptive Multiple Transforms (AMT) approach was announced. It is based on different transform modules that provided an efficient coding. However, the AMT solution raises several issues especially regarding the complexity of the selected set of transforms. This can be an important issue, particularly for a future industrial adoption. This paper proposed an efficient hardware implementation of the most used transform in AMT approach: the DCT II. The developed circuit is adapted to different block sizes and can reach a minimum frequency of 192 MHz allowing an optimized execution time.
Keywords: AMT, DCT II, hardware, transform, VVC.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 580400 Hardware Centric Machine Vision for High Precision Center of Gravity Calculation
Authors: Xin Cheng, Benny Thörnberg, Abdul Waheed Malik, Najeem Lawal
Abstract:
We present a hardware oriented method for real-time measurements of object-s position in video. The targeted application area is light spots used as references for robotic navigation. Different algorithms for dynamic thresholding are explored in combination with component labeling and Center Of Gravity (COG) for highest possible precision versus Signal-to-Noise Ratio (SNR). This method was developed with a low hardware cost in focus having only one convolution operation required for preprocessing of data.Keywords: Dynamic thresholding, segmentation, position measurement, sub-pixel precision, center of gravity.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2352399 Application of Hardware Efficient CIC Compensation Filter in Narrow Band Filtering
Authors: Vishal Awasthi, Krishna Raj
Abstract:
In many communication and signal processing systems, it is highly desirable to implement an efficient narrow-band filter that decimate or interpolate the incoming signals. This paper presents hardware efficient compensated CIC filter over a narrow band frequency that increases the speed of down sampling by using multiplierless decimation filters with polyphase FIR filter structure. The proposed work analyzed the performance of compensated CIC filter on the bases of the improvement of frequency response with reduced hardware complexity in terms of no. of adders and multipliers and produces the filtered results without any alterations. CIC compensator filter demonstrated that by using compensation with CIC filter improve the frequency response in passed of interest 26.57% with the reduction in hardware complexity 12.25% multiplications per input sample (MPIS) and 23.4% additions per input sample (APIS) w.r.t. FIR filter respectively.
Keywords: Multirate filtering, Narrow-band Signaling, Compensation Theory, CIC filter, Decimation, Compensation filter.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2948398 2-D Realization of WiMAX Channel Interleaver for Efficient Hardware Implementation
Authors: Rizwan Asghar, Dake Liu
Abstract:
The direct implementation of interleaver functions in WiMAX is not hardware efficient due to presence of complex functions. Also the conventional method i.e. using memories for storing the permutation tables is silicon consuming. This work presents a 2-D transformation for WiMAX channel interleaver functions which reduces the overall hardware complexity to compute the interleaver addresses on the fly. A fully reconfigurable architecture for address generation in WiMAX channel interleaver is presented, which consume 1.1 k-gates in total. It can be configured for any block size and any modulation scheme in WiMAX. The presented architecture can run at a frequency of 200 MHz, thus fully supporting high bandwidth requirements for WiMAX.Keywords: Interleaver, deinterleaver, WiMAX, 802.16e.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2312397 On the Design of Electronic Control Unitsfor the Safety-Critical Vehicle Applications
Authors: Kyung-Jung Lee, Hyun-Sik Ahn
Abstract:
This paper suggests a design methodology for the hardware and software of the electronic control unit (ECU) of safety-critical vehicle applications such as braking and steering. The architecture of the hardware is a high integrity system such thatit incorporates a high performance 32-bit CPU and a separate peripheral controlprocessor (PCP) together with an external watchdog CPU. Communication between the main CPU and the PCP is executed via a common area of RAM and events on either processor which are invoked by interrupts. Safety-related software is also implemented to provide a reliable, self-testing computing environment for safety critical and high integrity applications. The validity of the design approach is shown by using the hardware-in-the-loop simulation (HILS)for electric power steering(EPS) systemswhich consists of the EPS mechanism, the designed ECU, and monitoring tools.
Keywords: Electronic control unit, electric power steering, functional safety, hardware-in-the-loop simulation.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3368396 Design of a Neural Networks Classifier for Face Detection
Authors: F. Smach, M. Atri, J. Mitéran, M. Abid
Abstract:
Face detection and recognition has many applications in a variety of fields such as security system, videoconferencing and identification. Face classification is currently implemented in software. A hardware implementation allows real-time processing, but has higher cost and time to-market. The objective of this work is to implement a classifier based on neural networks MLP (Multi-layer Perceptron) for face detection. The MLP is used to classify face and non-face patterns. The systm is described using C language on a P4 (2.4 Ghz) to extract weight values. Then a Hardware implementation is achieved using VHDL based Methodology. We target Xilinx FPGA as the implementation support.Keywords: Classification, Face Detection, FPGA Hardware description, MLP.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2280395 Hardware Implementations for the ISO/IEC 18033-4:2005 Standard for Stream Ciphers
Authors: Paris Kitsos
Abstract:
In this paper the FPGA implementations for four stream ciphers are presented. The two stream ciphers, MUGI and SNOW 2.0 are recently adopted by the International Organization for Standardization ISO/IEC 18033-4:2005 standard. The other two stream ciphers, MICKEY 128 and TRIVIUM have been submitted and are under consideration for the eSTREAM, the ECRYPT (European Network of Excellence for Cryptology) Stream Cipher project. All ciphers were coded using VHDL language. For the hardware implementation, an FPGA device was used. The proposed implementations achieve throughputs range from 166 Mbps for MICKEY 128 to 6080 Mbps for MUGI.Keywords: Cryptography, ISO/IEC 18033-4:2005 standard, Hardware implementation, Stream ciphers
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1797394 A Framework for Product Development Process including HW and SW Components
Authors: Namchul Do, Gyeongseok Chae
Abstract:
This paper proposes a framework for product development including hardware and software components. It provides separation of hardware dependent software, modifications of current product development process, and integration of software modules with existing product configuration models and assembly product structures. In order to decide the dependent software, the framework considers product configuration modules and engineering changes of associated software and hardware components. In order to support efficient integration of the two different hardware and software development, a modified product development process is proposed. The process integrates the dependent software development into product development through the interchanges of specific product information. By using existing product data models in Product Data Management (PDM), the framework represents software as modules for product configurations and software parts for product structure. The framework is applied to development of a robot system in order to show its effectiveness.Keywords: HW and SW Development Integration, ProductDevelopment with Software.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2600393 Local Linear Model Tree (LOLIMOT) Reconfigurable Parallel Hardware
Authors: A. Pedram, M. R. Jamali, T. Pedram, S. M. Fakhraie, C. Lucas
Abstract:
Local Linear Neuro-Fuzzy Models (LLNFM) like other neuro- fuzzy systems are adaptive networks and provide robust learning capabilities and are widely utilized in various applications such as pattern recognition, system identification, image processing and prediction. Local linear model tree (LOLIMOT) is a type of Takagi-Sugeno-Kang neuro fuzzy algorithm which has proven its efficiency compared with other neuro fuzzy networks in learning the nonlinear systems and pattern recognition. In this paper, a dedicated reconfigurable and parallel processing hardware for LOLIMOT algorithm and its applications are presented. This hardware realizes on-chip learning which gives it the capability to work as a standalone device in a system. The synthesis results on FPGA platforms show its potential to improve the speed at least 250 of times faster than software implemented algorithms.
Keywords: LOLIMOT, hardware, neurofuzzy systems, reconfigurable, parallel.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3886392 CPU Architecture Based on Static Hardware Scheduler Engine and Multiple Pipeline Registers
Authors: Ionel Zagan, Vasile Gheorghita Gaitan
Abstract:
The development of CPUs and of real-time systems based on them made it possible to use time at increasingly low resolutions. Together with the scheduling methods and algorithms, time organizing has been improved so as to respond positively to the need for optimization and to the way in which the CPU is used. This presentation contains both a detailed theoretical description and the results obtained from research on improving the performances of the nMPRA (Multi Pipeline Register Architecture) processor by implementing specific functions in hardware. The proposed CPU architecture has been developed, simulated and validated by using the FPGA Virtex-7 circuit, via a SoC project. Although the nMPRA processor hardware structure with five pipeline stages is very complex, the present paper presents and analyzes the tests dedicated to the implementation of the CPU and of the memory on-chip for instructions and data. In order to practically implement and test the entire SoC project, various tests have been performed. These tests have been performed in order to verify the drivers for peripherals and the boot module named Bootloader.
Keywords: Hardware scheduler, nMPRA processor, real-time systems, scheduling methods.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1093391 Efficient Pipelined Hardware Implementation of RIPEMD-160 Hash Function
Authors: H. E. Michail, V. N. Thanasoulis, G. A. Panagiotakopoulos, A. P. Kakarountas, C. E. Goutis
Abstract:
In this paper an efficient implementation of Ripemd- 160 hash function is presented. Hash functions are a special family of cryptographic algorithms, which is used in technological applications with requirements for security, confidentiality and validity. Applications like PKI, IPSec, DSA, MAC-s incorporate hash functions and are used widely today. The Ripemd-160 is emanated from the necessity for existence of very strong algorithms in cryptanalysis. The proposed hardware implementation can be synthesized easily for a variety of FPGA and ASIC technologies. Simulation results, using commercial tools, verified the efficiency of the implementation in terms of performance and throughput. Special care has been taken so that the proposed implementation doesn-t introduce extra design complexity; while in parallel functionality was kept to the required levels.Keywords: Hardware implementation, hash functions, Ripemd-160, security.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1894