Search results for: hardware architecture
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1213

Search results for: hardware architecture

1213 Efficient Hardware Architecture of the Direct 2- D Transform for the HEVC Standard

Authors: Fatma Belghith, Hassen Loukil, Nouri Masmoudi

Abstract:

This paper presents the hardware design of a unified architecture to compute the 4x4, 8x8 and 16x16 efficient twodimensional (2-D) transform for the HEVC standard. This architecture is based on fast integer transform algorithms. It is designed only with adders and shifts in order to reduce the hardware cost significantly. The goal is to ensure the maximum circuit reuse during the computing while saving 40% for the number of operations. The architecture is developed using FIFOs to compute the second dimension. The proposed hardware was implemented in VHDL. The VHDL RTL code works at 240 MHZ in an Altera Stratix III FPGA. The number of cycles in this architecture varies from 33 in 4-point- 2D-DCT to 172 when the 16-point-2D-DCT is computed. Results show frequency improvements reaching 96% when compared to an architecture described as the direct transcription of the algorithm.

Keywords: HEVC, Modified Integer Transform, FPGA.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2697
1212 A Pipelined FSBM Hardware Architecture for HTDV-H.26x

Authors: H. Loukil, A. Ben Atitallah, F. Ghozzi, M. A. Ben Ayed, N. Masmoudi

Abstract:

In MPEG and H.26x standards, to eliminate the temporal redundancy we use motion estimation. Given that the motion estimation stage is very complex in terms of computational effort, a hardware implementation on a re-configurable circuit is crucial for the requirements of different real time multimedia applications. In this paper, we present hardware architecture for motion estimation based on "Full Search Block Matching" (FSBM) algorithm. This architecture presents minimum latency, maximum throughput, full utilization of hardware resources such as embedded memory blocks, and combining both pipelining and parallel processing techniques. Our design is described in VHDL language, verified by simulation and implemented in a Stratix II EP2S130F1020C4 FPGA circuit. The experiment result show that the optimum operating clock frequency of the proposed design is 89MHz which achieves 160M pixels/sec.

Keywords: SAD, FSBM, Hardware Implementation, FPGA.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1597
1211 Hardware Implementation of Local Binary Pattern Based Two-Bit Transform Motion Estimation

Authors: Seda Yavuz, Anıl Çelebi, Aysun Taşyapı Çelebi, Oğuzhan Urhan

Abstract:

Nowadays, demand for using real-time video transmission capable devices is ever-increasing. So, high resolution videos have made efficient video compression techniques an essential component for capturing and transmitting video data. Motion estimation has a critical role in encoding raw video. Hence, various motion estimation methods are introduced to efficiently compress the video. Low bit‑depth representation based motion estimation methods facilitate computation of matching criteria and thus, provide small hardware footprint. In this paper, a hardware implementation of a two-bit transformation based low-complexity motion estimation method using local binary pattern approach is proposed. Image frames are represented in two-bit depth instead of full-depth by making use of the local binary pattern as a binarization approach and the binarization part of the hardware architecture is explained in detail. Experimental results demonstrate the difference between the proposed hardware architecture and the architectures of well-known low-complexity motion estimation methods in terms of important aspects such as resource utilization, energy and power consumption.

Keywords: Binarization, hardware architecture, local binary pattern, motion estimation, two-bit transform.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1327
1210 2-D Realization of WiMAX Channel Interleaver for Efficient Hardware Implementation

Authors: Rizwan Asghar, Dake Liu

Abstract:

The direct implementation of interleaver functions in WiMAX is not hardware efficient due to presence of complex functions. Also the conventional method i.e. using memories for storing the permutation tables is silicon consuming. This work presents a 2-D transformation for WiMAX channel interleaver functions which reduces the overall hardware complexity to compute the interleaver addresses on the fly. A fully reconfigurable architecture for address generation in WiMAX channel interleaver is presented, which consume 1.1 k-gates in total. It can be configured for any block size and any modulation scheme in WiMAX. The presented architecture can run at a frequency of 200 MHz, thus fully supporting high bandwidth requirements for WiMAX.

Keywords: Interleaver, deinterleaver, WiMAX, 802.16e.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2253
1209 High Performance VLSI Architecture of 2D Discrete Wavelet Transform with Scalable Lattice Structure

Authors: Juyoung Kim, Taegeun Park

Abstract:

In this paper, we propose a fully-utilized, block-based 2D DWT (discrete wavelet transform) architecture, which consists of four 1D DWT filters with two-channel QMF lattice structure. The proposed architecture requires about 2MN-3N registers to save the intermediate results for higher level decomposition, where M and N stand for the filter length and the row width of the image respectively. Furthermore, the proposed 2D DWT processes in horizontal and vertical directions simultaneously without an idle period, so that it computes the DWT for an N×N image in a period of N2(1-2-2J)/3. Compared to the existing approaches, the proposed architecture shows 100% of hardware utilization and high throughput rates. To mitigate the long critical path delay due to the cascaded lattices, we can apply the pipeline technique with four stages, while retaining 100% of hardware utilization. The proposed architecture can be applied in real-time video signal processing.

Keywords: discrete wavelet transform, VLSI architecture, QMF lattice filter, pipelining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1732
1208 CPU Architecture Based on Static Hardware Scheduler Engine and Multiple Pipeline Registers

Authors: Ionel Zagan, Vasile Gheorghita Gaitan

Abstract:

The development of CPUs and of real-time systems based on them made it possible to use time at increasingly low resolutions. Together with the scheduling methods and algorithms, time organizing has been improved so as to respond positively to the need for optimization and to the way in which the CPU is used. This presentation contains both a detailed theoretical description and the results obtained from research on improving the performances of the nMPRA (Multi Pipeline Register Architecture) processor by implementing specific functions in hardware. The proposed CPU architecture has been developed, simulated and validated by using the FPGA Virtex-7 circuit, via a SoC project. Although the nMPRA processor hardware structure with five pipeline stages is very complex, the present paper presents and analyzes the tests dedicated to the implementation of the CPU and of the memory on-chip for instructions and data. In order to practically implement and test the entire SoC project, various tests have been performed. These tests have been performed in order to verify the drivers for peripherals and the boot module named Bootloader.

Keywords: Hardware scheduler, nMPRA processor, real-time systems, scheduling methods.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1039
1207 Low Complexity Multi Mode Interleaver Core for WiMAX with Support for Convolutional Interleaving

Authors: Rizwan Asghar, Dake Liu

Abstract:

A hardware efficient, multi mode, re-configurable architecture of interleaver/de-interleaver for multiple standards, like DVB, WiMAX and WLAN is presented. The interleavers consume a large part of silicon area when implemented by using conventional methods as they use memories to store permutation patterns. In addition, different types of interleavers in different standards cannot share the hardware due to different construction methodologies. The novelty of the work presented in this paper is threefold: 1) Mapping of vital types of interleavers including convolutional interleaver onto a single architecture with flexibility to change interleaver size; 2) Hardware complexity for channel interleaving in WiMAX is reduced by using 2-D realization of the interleaver functions; and 3) Silicon cost overheads reduced by avoiding the use of small memories. The proposed architecture consumes 0.18mm2 silicon area for 0.12μm process and can operate at a frequency of 140 MHz. The reduced complexity helps in minimizing the memory utilization, and at the same time provides strong support to on-the-fly computation of permutation patterns.

Keywords: Hardware interleaver implementation, WiMAX, DVB, block interleaver, convolutional interleaver, hardwaremultiplexing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1984
1206 Improving the Performances of the nMPRA Architecture by Implementing Specific Functions in Hardware

Authors: Ionel Zagan, Vasile Gheorghita Gaitan

Abstract:

Minimizing the response time to asynchronous events in a real-time system is an important factor in increasing the speed of response and an interesting concept in designing equipment fast enough for the most demanding applications. The present article will present the results regarding the validation of the nMPRA (Multi Pipeline Register Architecture) architecture using the FPGA Virtex-7 circuit. The nMPRA concept is a hardware processor with the scheduler implemented at the processor level; this is done without affecting a possible bus communication, as is the case with the other CPU solutions. The implementation of static or dynamic scheduling operations in hardware and the improvement of handling interrupts and events by the real-time executive described in the present article represent a key solution for eliminating the overhead of the operating system functions. The nMPRA processor is capable of executing a preemptive scheduling, using various algorithms without a software scheduler. Therefore, we have also presented various scheduling methods and algorithms used in scheduling the real-time tasks.

Keywords: nMPRA architecture, pipeline processor, preemptive scheduling, real-time system.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 830
1205 Design of Multi-disease Diagnosis Processor using Hypernetworks Technique

Authors: Jae-Yeon Song, Seung-Yerl Lee, Kyu-Yeul Wang, Byung-Soo Kim, Sang-Seol Lee, Seong-Seob Shin, Jae-Young Choi, Chong Ho Lee, Jeahyun Park, Duck-Jin Chung

Abstract:

In this paper, we propose disease diagnosis hardware architecture by using Hypernetworks technique. It can be used to diagnose 3 different diseases (SPECT Heart, Leukemia, Prostate cancer). Generally, the disparate diseases require specified diagnosis hardware model for each disease. Using similarities of three diseases diagnosis processor, we design diagnosis processor that can diagnose three different diseases. Our proposed architecture that is combining three processors to one processor can reduce hardware size without decrease of the accuracy.

Keywords: Diagnosis processor, Hypernetworks, Leukemia, Mask, Prostate cancer, SPECT Heart data

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1311
1204 Grid Learning; Computer Grid Joins to e- Learning

Authors: A. Nassiry, A. Kardan

Abstract:

According to development of communications and web-based technologies in recent years, e-Learning has became very important for everyone and is seen as one of most dynamic teaching methods. Grid computing is a pattern for increasing of computing power and storage capacity of a system and is based on hardware and software resources in a network with common purpose. In this article we study grid architecture and describe its different layers. In this way, we will analyze grid layered architecture. Then we will introduce a new suitable architecture for e-Learning which is based on grid network, and for this reason we call it Grid Learning Architecture. Various sections and layers of suggested architecture will be analyzed; especially grid middleware layer that has key role. This layer is heart of grid learning architecture and, in fact, regardless of this layer, e-Learning based on grid architecture will not be feasible.

Keywords: Distributed learning, Grid Learning, Grid network, SCORM standard.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1669
1203 FPGA Hardware Implementation and Evaluation of a Micro-Network Architecture for Multi-Core Systems

Authors: Yahia Salah, Med Lassaad Kaddachi, Rached Tourki

Abstract:

This paper presents the design, implementation and evaluation of a micro-network, or Network-on-Chip (NoC), based on a generic pipeline router architecture. The router is designed to efficiently support traffic generated by multimedia applications on embedded multi-core systems. It employs a simplest routing mechanism and implements the round-robin scheduling strategy to resolve output port contentions and minimize latency. A virtual channel flow control is applied to avoid the head-of-line blocking problem and enhance performance in the NoC. The hardware design of the router architecture has been implemented at the register transfer level; its functionality is evaluated in the case of the two dimensional Mesh/Torus topology, and performance results are derived from ModelSim simulator and Xilinx ISE 9.2i synthesis tool. An example of a multi-core image processing system utilizing the NoC structure has been implemented and validated to demonstrate the capability of the proposed micro-network architecture. To reduce complexity of the image compression and decompression architecture, the system use image processing algorithm based on classical discrete cosine transform with an efficient zonal processing approach. The experimental results have confirmed that both the proposed image compression scheme and NoC architecture can achieve a reasonable image quality with lower processing time.

Keywords: Generic Pipeline Network-on-Chip Router Architecture, JPEG Image Compression, FPGA Hardware Implementation, Performance Evaluation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3048
1202 FPGA Implementation of RSA Cryptosystem

Authors: Ridha Ghayoula, ElAmjed Hajlaoui, Talel Korkobi, Mbarek Traii, Hichem Trabelsi

Abstract:

In this paper, the hardware implementation of the RSA public-key cryptographic algorithm is presented. The RSA cryptographic algorithm is depends on the computation of repeated modular exponentials. The Montgomery algorithm is used and modified to reduce hardware resources and to achieve reasonable operating speed for FPGA. An efficient architecture for modular multiplications based on the array multiplier is proposed. We have implemented a RSA cryptosystem based on Montgomery algorithm. As a result, it is shown that proposed architecture contributes to small area and reasonable speed.

Keywords: RSA, Cryptosystem, Montgomery, Implementation.FPGA.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2367
1201 An Efficient Hardware Implementation of Extended and Fast Physical Addressing in Microprocessor-Based Systems Using Programmable Logic

Authors: Mountassar Maamoun, Abdelhamid Meraghni, Abdelhalim Benbelkacem, Daoud Berkani

Abstract:

This paper describes an efficient hardware implementation of a new technique for interfacing the data exchange between the microprocessor-based systems and the external devices. This technique, based on the use of software/hardware system and a reduced physical address, enlarges the interfacing capacity of the microprocessor-based systems, uses the Direct Memory Access (DMA) to increases the frequency of the new bus, and improves the speed of data exchange. While using this architecture in microprocessor-based system or in computer, the input of the hardware part of our system will be connected to the bus system, and the output, which is a new bus, will be connected to an external device. The new bus is composed of a data bus, a control bus and an address bus. A Xilinx Integrated Software Environment (ISE) 7.1i has been used for the programmable logic implementation.

Keywords: Interfacing, Software/hardware System, CPLD, programmable logic, DMA.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1350
1200 Design of Low-Area HEVC Core Transform Architecture

Authors: Seung-Mok Han, Woo-Jin Nam, Seongsoo Lee

Abstract:

This paper proposes and implements an core transform architecture, which is one of the major processes in HEVC video compression standard. The proposed core transform architecture is implemented with only adders and shifters instead of area-consuming multipliers. Shifters in the proposed core transform architecture are implemented in wires and multiplexers, which significantly reduces chip area. Also, it can process from 4×4 to 16×16 blocks with common hardware by reusing processing elements. Designed core transform architecture in 0.13um technology can process a 16×16 block with 2-D transform in 130 cycles, and its gate count is 101,015 gates.

Keywords: HEVC, Core transform, Low area, Shift-and-add, PE reuse

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1872
1199 Adaptive Multiple Transforms Hardware Architecture for Versatile Video Coding

Authors: T. Damak, S. Houidi, M. A. Ben Ayed, N. Masmoudi

Abstract:

The Versatile Video Coding standard (VVC) is actually under development by the Joint Video Exploration Team (or JVET). An Adaptive Multiple Transforms (AMT) approach was announced. It is based on different transform modules that provided an efficient coding. However, the AMT solution raises several issues especially regarding the complexity of the selected set of transforms. This can be an important issue, particularly for a future industrial adoption. This paper proposed an efficient hardware implementation of the most used transform in AMT approach: the DCT II. The developed circuit is adapted to different block sizes and can reach a minimum frequency of 192 MHz allowing an optimized execution time.

Keywords: AMT, DCT II, hardware, transform, VVC.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 522
1198 FPGA Based Parallel Architecture for the Computation of Third-Order Cross Moments

Authors: Syed Manzoor Qasim, Shuja Abbasi, Saleh Alshebeili, Bandar Almashary, Ateeq Ahmad Khan

Abstract:

Higher-order Statistics (HOS), also known as cumulants, cross moments and their frequency domain counterparts, known as poly spectra have emerged as a powerful signal processing tool for the synthesis and analysis of signals and systems. Algorithms used for the computation of cross moments are computationally intensive and require high computational speed for real-time applications. For efficiency and high speed, it is often advantageous to realize computation intensive algorithms in hardware. A promising solution that combines high flexibility together with the speed of a traditional hardware is Field Programmable Gate Array (FPGA). In this paper, we present FPGA-based parallel architecture for the computation of third-order cross moments. The proposed design is coded in Very High Speed Integrated Circuit (VHSIC) Hardware Description Language (VHDL) and functionally verified by implementing it on Xilinx Spartan-3 XC3S2000FG900-4 FPGA. Implementation results are presented and it shows that the proposed design can operate at a maximum frequency of 86.618 MHz.

Keywords: Cross moments, Cumulants, FPGA, Hardware Implementation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1682
1197 Fully Parameterizable FPGA based Crypto-Accelerator

Authors: Iqbalur Rahman, Miftahur Rahman, Abul L Haque, Mostafizur Rahman,

Abstract:

In this paper, RSA encryption algorithm and its hardware implementation in Xilinx-s Virtex Field Programmable Gate Arrays (FPGA) is analyzed. The issues of scalability, flexible performance, and silicon efficiency for the hardware acceleration of public key crypto systems are being explored in the present work. Using techniques based on the interleaved math for exponentiation, the proposed RSA calculation architecture is compared to existing FPGA-based solutions for speed, FPGA utilization, and scalability. The paper covers the RSA encryption algorithm, interleaved multiplication, Miller Rabin algorithm for primality test, extended Euclidean math, basic FPGA technology, and the implementation details of the proposed RSA calculation architecture. Performance of several alternative hardware architectures is discussed and compared. Finally, conclusion is drawn, highlighting the advantages of a fully flexible & parameterized design.

Keywords: Crypto Accelerator, FPGA, Public Key Cryptography, RSA.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2708
1196 Cellular Automata Based Robust Watermarking Architecture towards the VLSI Realization

Authors: V. H. Mankar, T. S. Das, S. K. Sarkar

Abstract:

In this paper, we have proposed a novel blind watermarking architecture towards its hardware implementation in VLSI. In order to facilitate this hardware realization, cellular automata (CA) concept is introduced. The CA has been already accepted as an attractive structure for VLSI implementation because of its modularity, parallelism, high performance and reliability. The hardware realizable multiresolution spread spectrum watermarking techniques are very few in numbers in spite of their best ever resiliency against signal impairments. This is because of the computational cost and complexity associated with their different filter banks and lifting techniques. The concept of cellular automata theory in order to form a new transform domain technique i.e. Cellular Automata Transform (CAT) have been incorporated. Since CA provides spreading sequences having very low cross-correlation properties, the CA based pseudorandom sequence generator is considered in the present work. Considering the watermarking technique as a digital communication process, an error control coding (ECC) must be incorporated in the data hiding schemes. Besides the hardware implementation of entire CA based data hiding technique, the individual blocks of the algorithm using CA provide the best result than that of some other methods irrespective of the hardware and software technique. The Cellular Automata Transform, CA based PN sequence generator, and CA ECC are the requisite blocks that are developed not only to meet the reliable hardware requirements but also for the basic spread spectrum watermarking features. The proposed algorithm shows statistical invisibility and resiliency against various common signal-processing operations. This algorithmic design utilizes the existing allocated bandwidth in the data transmission channel in a more efficient manner.

Keywords: Cellular automata, watermarking, error control coding, PN sequence, VLSI.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2023
1195 An Efficient Architecture for Interleaved Modular Multiplication

Authors: Ahmad M. Abdel Fattah, Ayman M. Bahaa El-Din, Hossam M.A. Fahmy

Abstract:

Modular multiplication is the basic operation in most public key cryptosystems, such as RSA, DSA, ECC, and DH key exchange. Unfortunately, very large operands (in order of 1024 or 2048 bits) must be used to provide sufficient security strength. The use of such big numbers dramatically slows down the whole cipher system, especially when running on embedded processors. So far, customized hardware accelerators - developed on FPGAs or ASICs - were the best choice for accelerating modular multiplication in embedded environments. On the other hand, many algorithms have been developed to speed up such operations. Examples are the Montgomery modular multiplication and the interleaved modular multiplication algorithms. Combining both customized hardware with an efficient algorithm is expected to provide a much faster cipher system. This paper introduces an enhanced architecture for computing the modular multiplication of two large numbers X and Y modulo a given modulus M. The proposed design is compared with three previous architectures depending on carry save adders and look up tables. Look up tables should be loaded with a set of pre-computed values. Our proposed architecture uses the same carry save addition, but replaces both look up tables and pre-computations with an enhanced version of sign detection techniques. The proposed architecture supports higher frequencies than other architectures. It also has a better overall absolute time for a single operation.

Keywords: Montgomery multiplication, modular multiplication, efficient architecture, FPGA, RSA

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2394
1194 Development of A Meta Description Language for Software/Hardware Cooperative Design and Verification for Model-Checking Systems

Authors: Katsumi Wasaki, Naoki Iwasaki

Abstract:

Model-checking tools such as Symbolic Model Verifier (SMV) and NuSMV are available for checking hardware designs. These tools can automatically check the formal legitimacy of a design. However, NuSMV is too low level for describing a complete hardware design. It is therefore necessary to translate the system definition, as designed in a language such as Verilog or VHDL, into a language such as NuSMV for validation. In this paper, we present a meta hardware description language, Melasy, that contains a code generator for existing hardware description languages (HDLs) and languages for model checking that solve this problem.

Keywords: meta description language, software/hardware codesign, co-verification, formal verification, hardware compiler, modelchecking.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1418
1193 On the Design of Electronic Control Unitsfor the Safety-Critical Vehicle Applications

Authors: Kyung-Jung Lee, Hyun-Sik Ahn

Abstract:

This paper suggests a design methodology for the hardware and software of the electronic control unit (ECU) of safety-critical vehicle applications such as braking and steering. The architecture of the hardware is a high integrity system such thatit incorporates a high performance 32-bit CPU and a separate peripheral controlprocessor (PCP) together with an external watchdog CPU. Communication between the main CPU and the PCP is executed via a common area of RAM and events on either processor which are invoked by interrupts. Safety-related software is also implemented to provide a reliable, self-testing computing environment for safety critical and high integrity applications. The validity of the design approach is shown by using the hardware-in-the-loop simulation (HILS)for electric power steering(EPS) systemswhich consists of the EPS mechanism, the designed ECU, and monitoring tools.

Keywords: Electronic control unit, electric power steering, functional safety, hardware-in-the-loop simulation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3275
1192 A Novel VLSI Architecture for Image Compression Model Using Low power Discrete Cosine Transform

Authors: Vijaya Prakash.A.M, K.S.Gurumurthy

Abstract:

In Image processing the Image compression can improve the performance of the digital systems by reducing the cost and time in image storage and transmission without significant reduction of the Image quality. This paper describes hardware architecture of low complexity Discrete Cosine Transform (DCT) architecture for image compression[6]. In this DCT architecture, common computations are identified and shared to remove redundant computations in DCT matrix operation. Vector processing is a method used for implementation of DCT. This reduction in computational complexity of 2D DCT reduces power consumption. The 2D DCT is performed on 8x8 matrix using two 1-Dimensional Discrete cosine transform blocks and a transposition memory [7]. Inverse discrete cosine transform (IDCT) is performed to obtain the image matrix and reconstruct the original image. The proposed image compression algorithm is comprehended using MATLAB code. The VLSI design of the architecture is implemented Using Verilog HDL. The proposed hardware architecture for image compression employing DCT was synthesized using RTL complier and it was mapped using 180nm standard cells. . The Simulation is done using Modelsim. The simulation results from MATLAB and Verilog HDL are compared. Detailed analysis for power and area was done using RTL compiler from CADENCE. Power consumption of DCT core is reduced to 1.027mW with minimum area[1].

Keywords: Discrete Cosine Transform (DCT), Inverse DiscreteCosine Transform (IDCT), Joint Photographic Expert Group (JPEG), Low Power Design, Very Large Scale Integration (VLSI) .

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3097
1191 An FPGA Implementation of Intelligent Visual Based Fall Detection

Authors: Peng Shen Ong, Yoong Choon Chang, Chee Pun Ooi, Ettikan K. Karuppiah, Shahirina Mohd Tahir

Abstract:

Falling has been one of the major concerns and threats to the independence of the elderly in their daily lives. With the worldwide significant growth of the aging population, it is essential to have a promising solution of fall detection which is able to operate at high accuracy in real-time and supports large scale implementation using multiple cameras. Field Programmable Gate Array (FPGA) is a highly promising tool to be used as a hardware accelerator in many emerging embedded vision based system. Thus, it is the main objective of this paper to present an FPGA-based solution of visual based fall detection to meet stringent real-time requirements with high accuracy. The hardware architecture of visual based fall detection which utilizes the pixel locality to reduce memory accesses is proposed. By exploiting the parallel and pipeline architecture of FPGA, our hardware implementation of visual based fall detection using FGPA is able to achieve a performance of 60fps for a series of video analytical functions at VGA resolutions (640x480). The results of this work show that FPGA has great potentials and impacts in enabling large scale vision system in the future healthcare industry due to its flexibility and scalability.

Keywords: Fall detection, FPGA, hardware implementation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2417
1190 Hardware Approach to Solving Password Exposure Problem through Keyboard Sniff

Authors: Kyungroul Lee, Kwangjin Bae, Kangbin Yim

Abstract:

This paper introduces a hardware solution to password exposure problem caused by direct accesses to the keyboard hardware interfaces through which a possible attacker is able to grab user-s password even where existing countermeasures are deployed. Several researches have proposed reasonable software based solutions to the problem for years. However, recently introduced hardware vulnerability problems have neutralized the software approaches and yet proposed any effective software solution to the vulnerability. Hardware approach in this paper is expected as the only solution to the vulnerability

Keywords: Keyboard sniff, password exposure, hardware vulnerability, privacy problem, insider security.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1529
1189 New VLSI Architecture for Motion Estimation Algorithm

Authors: V. S. K. Reddy, S. Sengupta, Y. M. Latha

Abstract:

This paper presents an efficient VLSI architecture design to achieve real time video processing using Full-Search Block Matching (FSBM) algorithm. The design employs parallel bank architecture with minimum latency, maximum throughput, and full hardware utilization. We use nine parallel processors in our architecture and each controlled by a state machine. State machine control implementation makes the design very simple and cost effective. The design is implemented using VHDL and the programming techniques we incorporated makes the design completely programmable in the sense that the search ranges and the block sizes can be varied to suit any given requirements. The design can operate at frequencies up to 36 MHz and it can function in QCIF and CIF video resolution at 1.46 MHz and 5.86 MHz, respectively.

Keywords: Video Coding, Motion Estimation, Full-Search, Block-Matching, VLSI Architecture.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1761
1188 An E-Retailing System Architecture Based on Cloud Computing

Authors: Chanchai Supaartagorn

Abstract:

E-retailing is the sale of goods online that takes place over the Internet. The Internet has shrunk the entire World. World eretailing is growing at an exponential rate in the Americas, Europe and Asia. However, e-retailing costs require expensive investment, such as hardware, software, and security systems. Cloud computing technology is internet-based computing for the management and delivery of applications and services. Cloud-based e-retailing application models allow enterprises to lower their costs with their effective implementation of e-retailing activities. In this paper, we describe the concept of cloud computing and present the architecture of cloud computing, combining the features of e-retailing. In addition, we propose a strategy for implementing cloud computing with e-retailing. Finally, we explain the benefits from the architecture.

Keywords: Architecture, cloud computing, e-retailing, internet-based.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3099
1187 Development of Reliable Web-Based Laboratories for Developing Countries

Authors: Teyana S. Sapula, Damian D. Haule

Abstract:

In online context, the design and implementation of effective remote laboratories environment is highly challenging on account of hardware and software needs. This paper presents the remote laboratory software framework modified from ilab shared architecture (ISA). The ISA is a framework which enables students to remotely acccess and control experimental hardware using internet infrastructure. The need for remote laboratories came after experiencing problems imposed by traditional laboratories. Among them are: the high cost of laboratory equipment, scarcity of space, scarcity of technical personnel along with the restricted university budget creates a significant bottleneck on building required laboratory experiments. The solution to these problems is to build web-accessible laboratories. Remote laboratories allow students and educators to interact with real laboratory equipment located anywhere in the world at anytime. Recently, many universities and other educational institutions especially in third world countries rely on simulations because they do not afford the experimental equipment they require to their students. Remote laboratories enable users to get real data from real-time hand-on experiments. To implement many remote laboratories, the system architecture should be flexible, understandable and easy to implement, so that different laboratories with different hardware can be deployed easily. The modifications were made to enable developers to add more equipment in ISA framework and to attract the new developers to develop many online laboratories.

Keywords: Batched, ISA, labserver, servicebroker.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1380
1186 FPGA Implementation of the “PYRAMIDS“ Block Cipher

Authors: A. AlKalbany, H. Al hassan, M. Saeb

Abstract:

The “PYRAMIDS" Block Cipher is a symmetric encryption algorithm of a 64, 128, 256-bit length, that accepts a variable key length of 128, 192, 256 bits. The algorithm is an iterated cipher consisting of repeated applications of a simple round transformation with different operations and different sequence in each round. The algorithm was previously software implemented in Cµ code. In this paper, a hardware implementation of the algorithm, using Field Programmable Gate Arrays (FPGA), is presented. In this work, we discuss the algorithm, the implemented micro-architecture, and the simulation and implementation results. Moreover, we present a detailed comparison with other implemented standard algorithms. In addition, we include the floor plan as well as the circuit diagrams of the various micro-architecture modules.

Keywords: FPGA, VHDL, micro-architecture, encryption, cryptography, algorithm, data communication security.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1652
1185 A Four Method Framework for Fighting Software Architecture Erosion

Authors: Sundus Ayyaz, Saad Rehman, Usman Qamar

Abstract:

Software Architecture is the basic structure of software that states the development and advancement of a software system. Software architecture is also considered as a significant tool for the construction of high quality software systems. A clean design leads to the control, value and beauty of software resulting in its longer life while a bad design is the cause of architectural erosion where a software evolution completely fails. This paper discusses the occurrence of software architecture erosion and presents a set of methods for the detection, declaration and prevention of architecture erosion. The causes and symptoms of architecture erosion are observed with the examples of prescriptive and descriptive architectures and the practices used to stop this erosion are also discussed by considering different types of software erosion and their affects. Consequently finding and devising the most suitable approach for fighting software architecture erosion and in some way reducing its affect is evaluated and tested on different scenarios.

Keywords: Software Architecture, Architecture Erosion, Prescriptive Architecture, Descriptive Architecture.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2092
1184 A Parallel Implementation of the Reverse Converter for the Moduli Set {2n, 2n–1, 2n–1–1}

Authors: Mehdi Hosseinzadeh, Amir Sabbagh Molahosseini, Keivan Navi

Abstract:

In this paper, a new reverse converter for the moduli set {2n, 2n–1, 2n–1–1} is presented. We improved a previously introduced conversion algorithm for deriving an efficient hardware design for reverse converter. Hardware architecture of the proposed converter is based on carry-save adders and regular binary adders, without the requirement for modular adders. The presented design is faster than the latest introduced reverse converter for moduli set {2n, 2n–1, 2n–1–1}. Also, it has better performance than the reverse converters for the recently introduced moduli set {2n+1–1, 2n, 2n–1}

Keywords: Residue arithmetic, Residue number system, Residue-to-Binary converter, Reverse converter

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1268