Search results for: modulo 2n+1 adders
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 31

Search results for: modulo 2n+1 adders

31 Improved Modulo 2n +1 Adder Design

Authors: Somayeh Timarchi, Keivan Navi

Abstract:

Efficient modulo 2n+1 adders are important for several applications including residue number system, digital signal processors and cryptography algorithms. In this paper we present a novel modulo 2n+1 addition algorithm for a recently represented number system. The proposed approach is introduced for the reduction of the power dissipated. In a conventional modulo 2n+1 adder, all operands have (n+1)-bit length. To avoid using (n+1)-bit circuits, the diminished-1 and carry save diminished-1 number systems can be effectively used in applications. In the paper, we also derive two new architectures for designing modulo 2n+1 adder, based on n-bit ripple-carry adder. The first architecture is a faster design whereas the second one uses less hardware. In the proposed method, the special treatment required for zero operands in Diminished-1 number system is removed. In the fastest modulo 2n+1 adders in normal binary system, there are 3-operand adders. This problem is also resolved in this paper. The proposed architectures are compared with some efficient adders based on ripple-carry adder and highspeed adder. It is shown that the hardware overhead and power consumption will be reduced. As well as power reduction, in some cases, power-delay product will be also reduced.

Keywords: Modulo 2n+1 arithmetic, residue number system, low power, ripple-carry adders.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2848
30 Efficient Power-Delay Product Modulo 2n+1 Adder Design

Authors: Yavar Safaei Mehrabani, Mehdi Hosseinzadeh

Abstract:

As embedded and portable systems were emerged power consumption of circuits had been major challenge. On the other hand latency as determines frequency of circuits is also vital task. Therefore, trade off between both of them will be desirable. Modulo 2n+1 adders are important part of the residue number system (RNS) based arithmetic units with the interesting moduli set (2n-1,2n, 2n+1). In this manuscript we have introduced novel binary representation to the design of modulo 2n+1 adder. VLSI realization of proposed architecture under 180 nm full static CMOS technology reveals its superiority in terms of area, power consumption and power-delay product (PDP) against several peer existing structures.

Keywords: Computer arithmetic, modulo 2n+1 adders, Residue Number System (RNS), VLSI.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1753
29 A 1.2-ns16×16-Bit Binary Multiplier Using High Speed Compressors

Authors: A. Dandapat, S. Ghosal, P. Sarkar, D. Mukhopadhyay

Abstract:

For higher order multiplications, a huge number of adders or compressors are to be used to perform the partial product addition. We have reduced the number of adders by introducing special kind of adders that are capable to add five/six/seven bits per decade. These adders are called compressors. Binary counter property has been merged with the compressor property to develop high order compressors. Uses of these compressors permit the reduction of the vertical critical paths. A 16×16 bit multiplier has been developed using these compressors. These compressors make the multipliers faster as compared to the conventional design that have been used 4-2 compressors and 3-2 compressors.

Keywords: Binary multiplier, Compressors, Counter, Column adder, Low power.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3582
28 Two New Low Power High Performance Full Adders with Minimum Gates

Authors: M.Hosseinghadiry, H. Mohammadi, M.Nadisenejani

Abstract:

with increasing circuits- complexity and demand to use portable devices, power consumption is one of the most important parameters these days. Full adders are the basic block of many circuits. Therefore reducing power consumption in full adders is very important in low power circuits. One of the most powerconsuming modules in full adders is XOR/XNOR circuit. This paper presents two new full adders based on two new logic approaches. The proposed logic approaches use one XOR or XNOR gate to implement a full adder cell. Therefore, delay and power will be decreased. Using two new approaches and two XOR and XNOR gates, two new full adders have been implemented in this paper. Simulations are carried out by HSPICE in 0.18μm bulk technology with 1.8V supply voltage. The results show that the ten-transistors proposed full adder has 12% less power consumption and is 5% faster in comparison to MB12T full adder. 9T is more efficient in area and is 24% better than similar 10T full adder in term of power consumption. The main drawback of the proposed circuits is output threshold loss problem.

Keywords: Full adder, XNOR, Low power, High performance, Very Large Scale Integrated Circuit.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2030
27 Group of p-th Roots of Unity Modulo n

Authors: Rochdi Omami, Mohamed Omami, Raouf Ouni

Abstract:

Let n ≥ 3 be an integer and p be a prime odd number. Let us consider Gp(n) the subgroup of (Z/nZ)* defined by : Gp(n) = {x ∈ (Z/nZ)* / xp = 1}. In this paper, we give an algorithm that computes a generating set of this subgroup.

Keywords: Group, p-th roots, modulo, unity.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 980
26 Group of Square Roots of Unity Modulo n

Authors: Rochdi Omami, Mohamed Omami, Raouf Ouni

Abstract:

Let n ≥ 3 be an integer and G2(n) be the subgroup of square roots of 1 in (Z/nZ)*. In this paper, we give an algorithm that computes a generating set of this subgroup.

Keywords: Group, modulo, square roots, unity.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1893
25 A Parallel Implementation of the Reverse Converter for the Moduli Set {2n, 2n–1, 2n–1–1}

Authors: Mehdi Hosseinzadeh, Amir Sabbagh Molahosseini, Keivan Navi

Abstract:

In this paper, a new reverse converter for the moduli set {2n, 2n–1, 2n–1–1} is presented. We improved a previously introduced conversion algorithm for deriving an efficient hardware design for reverse converter. Hardware architecture of the proposed converter is based on carry-save adders and regular binary adders, without the requirement for modular adders. The presented design is faster than the latest introduced reverse converter for moduli set {2n, 2n–1, 2n–1–1}. Also, it has better performance than the reverse converters for the recently introduced moduli set {2n+1–1, 2n, 2n–1}

Keywords: Residue arithmetic, Residue number system, Residue-to-Binary converter, Reverse converter

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1268
24 Classification of the Bachet Elliptic Curves y2 = x3 + a3 in Fp, where p ≡ 1 (mod 6) is Prime

Authors: Nazli Yildiz İkikardes, Gokhan Soydan, Musa Demirci, Ismail Naci Cangul

Abstract:

In this work, we first give in what fields Fp, the cubic root of unity lies in F*p, in Qp and in K*p where Qp and K*p denote the sets of quadratic and non-zero cubic residues modulo p. Then we use these to obtain some results on the classification of the Bachet elliptic curves y2 ≡ x3 +a3 modulo p, for p ≡ 1 (mod 6) is prime.

Keywords: Elliptic curves over finite fields, quadratic residue, cubic residue.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1791
23 Implementation of Adder-Subtracter Design with VerilogHDL

Authors: May Phyo Thwal, Khin Htay Kyi, Kyaw Swar Soe

Abstract:

According to the density of the chips, designers are trying to put so any facilities of computational and storage on single chips. Along with the complexity of computational and storage circuits, the designing, testing and debugging become more and more complex and expensive. So, hardware design will be built by using very high speed hardware description language, which is more efficient and cost effective. This paper will focus on the implementation of 32-bit ALU design based on Verilog hardware description language. Adder and subtracter operate correctly on both unsigned and positive numbers. In ALU, addition takes most of the time if it uses the ripple-carry adder. The general strategy for designing fast adders is to reduce the time required to form carry signals. Adders that use this principle are called carry look- ahead adder. The carry look-ahead adder is to be designed with combination of 4-bit adders. The syntax of Verilog HDL is similar to the C programming language. This paper proposes a unified approach to ALU design in which both simulation and formal verification can co-exist.

Keywords: Addition, arithmetic logic unit, carry look-ahead adder, Verilog HDL.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 8875
22 The Number of Rational Points on Elliptic Curves y2 = x3 + a3 on Finite Fields

Authors: Musa Demirci, Nazlı Yıldız İkikardeş, Gökhan Soydan, İsmail Naci Cangül

Abstract:

In this work, we consider the rational points on elliptic curves over finite fields Fp. We give results concerning the number of points Np,a on the elliptic curve y2 ≡ x3 +a3(mod p) according to whether a and x are quadratic residues or non-residues. We use two lemmas to prove the main results first of which gives the list of primes for which -1 is a quadratic residue, and the second is a result from [1]. We get the results in the case where p is a prime congruent to 5 modulo 6, while when p is a prime congruent to 1 modulo 6, there seems to be no regularity for Np,a.

Keywords: Elliptic curves over finite fields, rational points, quadratic residue.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2346
21 A New Efficient Scalable BIST Full Adder using Polymorphic Gates

Authors: M. Mashayekhi, H. H. Ardakani, A. Omidian

Abstract:

Among various testing methodologies, Built-in Self- Test (BIST) is recognized as a low cost, effective paradigm. Also, full adders are one of the basic building blocks of most arithmetic circuits in all processing units. In this paper, an optimized testable 2- bit full adder as a test building block is proposed. Then, a BIST procedure is introduced to scale up the building block and to generate a self testable n-bit full adders. The target design can achieve 100% fault coverage using insignificant amount of hardware redundancy. Moreover, Overall test time is reduced by utilizing polymorphic gates and also by testing full adder building blocks in parallel.

Keywords: BIST, Full Adder, Polymorphic Gate

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1725
20 Equalities in a Variety of Multiple Algebras

Authors: Mona Taheri

Abstract:

The purpose of this research is to study the concepts of multiple Cartesian product, variety of multiple algebras and to present some examples. In the theory of multiple algebras, like other theories, deriving new things and concepts from the things and concepts available in the context is important. For example, the first were obtained from the quotient of a group modulo the equivalence relation defined by a subgroup of it. Gratzer showed that every multiple algebra can be obtained from the quotient of a universal algebra modulo a given equivalence relation. The purpose of this study is examination of multiple algebras and basic relations defined on them as well as introduction to some algebraic structures derived from multiple algebras. Among the structures obtained from multiple algebras, this article studies submultiple algebras, quotients of multiple algebras and the Cartesian product of multiple algebras.

Keywords: hypergroup, multiple algebras

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1317
19 An Efficient Architecture for Interleaved Modular Multiplication

Authors: Ahmad M. Abdel Fattah, Ayman M. Bahaa El-Din, Hossam M.A. Fahmy

Abstract:

Modular multiplication is the basic operation in most public key cryptosystems, such as RSA, DSA, ECC, and DH key exchange. Unfortunately, very large operands (in order of 1024 or 2048 bits) must be used to provide sufficient security strength. The use of such big numbers dramatically slows down the whole cipher system, especially when running on embedded processors. So far, customized hardware accelerators - developed on FPGAs or ASICs - were the best choice for accelerating modular multiplication in embedded environments. On the other hand, many algorithms have been developed to speed up such operations. Examples are the Montgomery modular multiplication and the interleaved modular multiplication algorithms. Combining both customized hardware with an efficient algorithm is expected to provide a much faster cipher system. This paper introduces an enhanced architecture for computing the modular multiplication of two large numbers X and Y modulo a given modulus M. The proposed design is compared with three previous architectures depending on carry save adders and look up tables. Look up tables should be loaded with a set of pre-computed values. Our proposed architecture uses the same carry save addition, but replaces both look up tables and pre-computations with an enhanced version of sign detection techniques. The proposed architecture supports higher frequencies than other architectures. It also has a better overall absolute time for a single operation.

Keywords: Montgomery multiplication, modular multiplication, efficient architecture, FPGA, RSA

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2394
18 A New Efficient RNS Reverse Converter for the 4-Moduli Set 

Authors: Edem K. Bankas, Kazeem A. Gbolagade

Abstract:

In this paper, we propose a new efficient reverse converter for the 4-moduli set {2n, 2n + 1, 2n 1, 22n+1 1} based on a modified Chinese Remainder Theorem and Mixed Radix Conversion. Additionally, the resulting architecture is further reduced to obtain a reverse converter that utilizes only carry save adders, a multiplexer and carry propagate adders. The proposed converter has an area cost of (12n + 2) FAs and (5n + 1) HAs with a delay of (9n + 6)tFA + tMUX. When compared with state of the art, our proposal demonstrates to be faster, at the expense of slightly more hardware resources. Further, the Area-Time square metric was computed which indicated that our proposed scheme outperforms the state of the art reverse converter.

Keywords: Modified Chinese Remainder Theorem, Mixed Radix Conversion, Reverse Converter, Carry Save Adder, Carry Propagate Adder.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2259
17 A Formal Property Verification for Aspect-Oriented Programs in Software Development

Authors: Moustapha Bande, Hakima Ould-Slimane, Hanifa Boucheneb

Abstract:

Software development for complex systems requires efficient and automatic tools that can be used to verify the satisfiability of some critical properties such as security ones. With the emergence of Aspect-Oriented Programming (AOP), considerable work has been done in order to better modularize the separation of concerns in the software design and implementation. The goal is to prevent the cross-cutting concerns to be scattered across the multiple modules of the program and tangled with other modules. One of the key challenges in the aspect-oriented programs is to be sure that all the pieces put together at the weaving time ensure the satisfiability of the overall system requirements. Our paper focuses on this problem and proposes a formal property verification approach for a given property from the woven program. The approach is based on the control flow graph (CFG) of the woven program, and the use of a satisfiability modulo theories (SMT) solver to check whether each property (represented par one aspect) is satisfied or not once the weaving is done.

Keywords: Aspect-oriented programming, control flow graph, satisfiability modulo theories, property verification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 687
16 A Novel Low Power, High Speed 14 Transistor CMOS Full Adder Cell with 50% Improvement in Threshold Loss Problem

Authors: T. Vigneswaran, B. Mukundhan, P. Subbarami Reddy

Abstract:

Full adders are important components in applications such as digital signal processors (DSP) architectures and microprocessors. In addition to its main task, which is adding two numbers, it participates in many other useful operations such as subtraction, multiplication, division,, address calculation,..etc. In most of these systems the adder lies in the critical path that determines the overall speed of the system. So enhancing the performance of the 1-bit full adder cell (the building block of the adder) is a significant goal.Demands for the low power VLSI have been pushing the development of aggressive design methodologies to reduce the power consumption drastically. To meet the growing demand, we propose a new low power adder cell by sacrificing the MOS Transistor count that reduces the serious threshold loss problem, considerably increases the speed and decreases the power when compared to the static energy recovery full (SERF) adder. So a new improved 14T CMOS l-bit full adder cell is presented in this paper. Results show 50% improvement in threshold loss problem, 45% improvement in speed and considerable power consumption over the SERF adder and other different types of adders with comparable performance.

Keywords: Arithmetic circuit, full adder, multiplier, low power, very Large-scale integration (VLSI).

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3904
15 Assessing the Relation between Theory of Multiple Algebras and Universal Algebras

Authors: Mona Taheri

Abstract:

In this study, we examine multiple algebras and algebraic structures derived from them and by stating a theory on multiple algebras; we will show that the theory of multiple algebras is a natural extension of the theory of universal algebras. Also, we will treat equivalence relations on multiple algebras, for which the quotient constructed modulo them is a universal algebra and will study the basic relation and the fundamental algebra in question. In this study, by stating the characteristic theorem of multiple algebras, we show that the theory of multiple algebras is a natural extension of the theory of universal algebras.

Keywords: multiple algebras , universal algebras

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1140
14 Design of Low-Area HEVC Core Transform Architecture

Authors: Seung-Mok Han, Woo-Jin Nam, Seongsoo Lee

Abstract:

This paper proposes and implements an core transform architecture, which is one of the major processes in HEVC video compression standard. The proposed core transform architecture is implemented with only adders and shifters instead of area-consuming multipliers. Shifters in the proposed core transform architecture are implemented in wires and multiplexers, which significantly reduces chip area. Also, it can process from 4×4 to 16×16 blocks with common hardware by reusing processing elements. Designed core transform architecture in 0.13um technology can process a 16×16 block with 2-D transform in 130 cycles, and its gate count is 101,015 gates.

Keywords: HEVC, Core transform, Low area, Shift-and-add, PE reuse

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1872
13 Public Key Cryptosystem based on Number Theoretic Transforms

Authors: C. Porkodi, R. Arumuganathan

Abstract:

In this paper a Public Key Cryptosystem is proposed using the number theoretic transforms (NTT) over a ring of integer modulo a composite number. The key agreement is similar to ElGamal public key algorithm. The security of the system is based on solution of multivariate linear congruence equations and discrete logarithm problem. In the proposed cryptosystem only fixed numbers of multiplications are carried out (constant complexity) and hence the encryption and decryption can be done easily. At the same time, it is very difficult to attack the cryptosystem, since the cipher text is a sequence of integers which are interrelated. The system provides authentication also. Using Mathematica version 5.0 the proposed algorithm is justified with a numerical example.

Keywords: Cryptography, decryption, discrete logarithm problem encryption, Integer Factorization problem, Key agreement, Number Theoretic Transform.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1633
12 Efficient Hardware Architecture of the Direct 2- D Transform for the HEVC Standard

Authors: Fatma Belghith, Hassen Loukil, Nouri Masmoudi

Abstract:

This paper presents the hardware design of a unified architecture to compute the 4x4, 8x8 and 16x16 efficient twodimensional (2-D) transform for the HEVC standard. This architecture is based on fast integer transform algorithms. It is designed only with adders and shifts in order to reduce the hardware cost significantly. The goal is to ensure the maximum circuit reuse during the computing while saving 40% for the number of operations. The architecture is developed using FIFOs to compute the second dimension. The proposed hardware was implemented in VHDL. The VHDL RTL code works at 240 MHZ in an Altera Stratix III FPGA. The number of cycles in this architecture varies from 33 in 4-point- 2D-DCT to 172 when the 16-point-2D-DCT is computed. Results show frequency improvements reaching 96% when compared to an architecture described as the direct transcription of the algorithm.

Keywords: HEVC, Modified Integer Transform, FPGA.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2697
11 High Speed NP-CMOS and Multi-Output Dynamic Full Adder Cells

Authors: Reza Faghih Mirzaee, Mohammad Hossein Moaiyeri, Keivan Navi

Abstract:

In this paper we present two novel 1-bit full adder cells in dynamic logic style. NP-CMOS (Zipper) and Multi-Output structures are used to design the adder blocks. Characteristic of dynamic logic leads to higher speeds than the other standard static full adder cells. Using HSpice and 0.18┬Ám CMOS technology exhibits a significant decrease in the cell delay which can result in a considerable reduction in the power-delay product (PDP). The PDP of Multi-Output design at 1.8v power supply is around 0.15 femto joule that is 5% lower than conventional dynamic full adder cell and at least 21% lower than other static full adders.

Keywords: Bridge Style, Dynamic Logic, Full Adder, HighSpeed, Multi Output, NP-CMOS, Zipper.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3207
10 Application of Hardware Efficient CIC Compensation Filter in Narrow Band Filtering

Authors: Vishal Awasthi, Krishna Raj

Abstract:

In many communication and signal processing systems, it is highly desirable to implement an efficient narrow-band filter that decimate or interpolate the incoming signals. This paper presents hardware efficient compensated CIC filter over a narrow band frequency that increases the speed of down sampling by using multiplierless decimation filters with polyphase FIR filter structure. The proposed work analyzed the performance of compensated CIC filter on the bases of the improvement of frequency response with reduced hardware complexity in terms of no. of adders and multipliers and produces the filtered results without any alterations. CIC compensator filter demonstrated that by using compensation with CIC filter improve the frequency response in passed of interest 26.57% with the reduction in hardware complexity 12.25% multiplications per input sample (MPIS) and 23.4% additions per input sample (APIS) w.r.t. FIR filter respectively.

Keywords: Multirate filtering, Narrow-band Signaling, Compensation Theory, CIC filter, Decimation, Compensation filter.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2893
9 Minimizing Mutant Sets by Equivalence and Subsumption

Authors: Samia Alblwi, Amani Ayad

Abstract:

Mutation testing is the art of generating syntactic variations of a base program and checking whether a candidate test suite can identify all the mutants that are not semantically equivalent to the base; this technique can be used to assess the quality of test suite. One of the main obstacles to the widespread use of mutation testing is cost, as even small programs (a few dozen lines of code) can give rise to a large number of mutants (up to hundreds); this has created an incentive to seek to reduce the number of mutants while preserving their collective effectiveness. Two criteria have been used to reduce the size of mutant sets: equivalence, which aims to partition the set of mutants into equivalence classes modulo semantic equivalence, and selecting one representative per class; and, subsumption, which aims to define a partial ordering among mutants that ranks mutants by effectiveness and seeks to select maximal elements in this ordering. In this paper, we analyze these two policies using analytical and empirical criteria.

Keywords: Mutation testing, mutant sets, mutant equivalence, mutant subsumption, mutant set minimization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 87
8 An Efficient Implementation of High Speed Vedic Multiplier Using Compressors for Image Processing Applications

Authors: Shobha Sharma, Amita Dev, Akanksha Kant

Abstract:

Digital signal processor, image signal processor and FIR filters have multipliers as an important part of their design. On the basis of Vedic mathematics, Vedic multipliers have come out to be very fast multipliers. One of the image processing applications is edge detection. This research presents a small area and high speed 8 bit Vedic multiplier system comprising of compressor based adders. This results in faster edge detection. This architecture is tested on Xilinx vertex 4 FPGA board and simulations were carried out using the Xilinx synthesis tool. Comparisons are made and this system is found to be smaller in area with high speed (the lesser propagation delay). This compressor based Vedic multiplier is 1.1 times speedier than a typical Vedic multiplier. Also, this Vedic Multiplier is 2 times speedier than a ‘simple’ multiplier.

Keywords: Detection of edges, Vedic multiplier, image processing, Urdhva Tiryakbhyam sutra.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1764
7 Hybrid Prefix Adder Architecture for Minimizing the Power Delay Product

Authors: P.Ramanathan, P.T.Vanathi

Abstract:

Parallel Prefix addition is a technique for improving the speed of binary addition. Due to continuing integrating intensity and the growing needs of portable devices, low-power and highperformance designs are of prime importance. The classical parallel prefix adder structures presented in the literature over the years optimize for logic depth, area, fan-out and interconnect count of logic circuits. In this paper, a new architecture for performing 8-bit, 16-bit and 32-bit Parallel Prefix addition is proposed. The proposed prefix adder structures is compared with several classical adders of same bit width in terms of power, delay and number of computational nodes. The results reveal that the proposed structures have the least power delay product when compared with its peer existing Prefix adder structures. Tanner EDA tool was used for simulating the adder designs in the TSMC 180 nm and TSMC 130 nm technologies.

Keywords: Parallel Prefix Adder (PPA), Dot operator, Semi-Dotoperator, Complementary Metal Oxide Semiconductor (CMOS), Odd-dot operator, Even-dot operator, Odd-semi-dot operator andEven-semi-dot operator.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1673
6 Optimization of SAD Algorithm on VLIW DSP

Authors: Hui-Jae You, Sun-Tae Chung, Souhwan Jung

Abstract:

SAD (Sum of Absolute Difference) algorithm is heavily used in motion estimation which is computationally highly demanding process in motion picture encoding. To enhance the performance of motion picture encoding on a VLIW processor, an efficient implementation of SAD algorithm on the VLIW processor is essential. SAD algorithm is programmed as a nested loop with a conditional branch. In VLIW processors, loop is usually optimized by software pipelining, but researches on optimal scheduling of software pipelining for nested loops, especially nested loops with conditional branches are rare. In this paper, we propose an optimal scheduling and implementation of SAD algorithm with conditional branch on a VLIW DSP processor. The proposed optimal scheduling first transforms the nested loop with conditional branch into a single loop with conditional branch with consideration of full utilization of ILP capability of the VLIW processor and realization of earlier escape from the loop. Next, the proposed optimal scheduling applies a modulo scheduling technique developed for single loop. Based on this optimal scheduling strategy, optimal implementation of SAD algorithm on TMS320C67x, a VLIW DSP is presented. Through experiments on TMS320C6713 DSK, it is shown that H.263 encoder with the proposed SAD implementation performs better than other H.263 encoder with other SAD implementations, and that the code size of the optimal SAD implementation is small enough to be appropriate for embedded environments.

Keywords: Optimal implementation, SAD algorithm, VLIW, TMS320C6713.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2289
5 A New Approach to Design an Efficient CIC Decimator Using Signed Digit Arithmetic

Authors: Vishal Awasthi, Krishna Raj

Abstract:

Any digital processing performed on a signal with larger nyquist interval requires more computation than signal processing performed on smaller nyquist interval. The sampling rate alteration generates the unwanted effects in the system such as spectral aliasing and spectral imaging during signal processing. Multirate-multistage implementation of digital filter can result a significant computational saving than single rate filter designed for sample rate conversion. In this paper, we presented an efficient cascaded integrator comb (CIC) decimation filter that perform fast down sampling using signed digit adder algorithm with compensated frequency droop that arises due to aliasing effect during the decimation process. This proposed compensated CIC decimation filter structure with a hybrid signed digit (HSD) fast adder provide an improved performance in terms of down sampling speed by 65.15% than ripple carry adder (RCA) and reduced area and power by 57.5% and 0.01 % than signed digit (SD) adder algorithms respectively.

Keywords: Sampling rate conversion, Multirate Filtering, Compensation Theory, Decimation filter, CIC filter, Redundant signed digit arithmetic, Fast adders.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4845
4 An Algorithm Proposed for FIR Filter Coefficients Representation

Authors: Mohamed Al Mahdi Eshtawie, Masuri Bin Othman

Abstract:

Finite impulse response (FIR) filters have the advantage of linear phase, guaranteed stability, fewer finite precision errors, and efficient implementation. In contrast, they have a major disadvantage of high order need (more coefficients) than IIR counterpart with comparable performance. The high order demand imposes more hardware requirements, arithmetic operations, area usage, and power consumption when designing and fabricating the filter. Therefore, minimizing or reducing these parameters, is a major goal or target in digital filter design task. This paper presents an algorithm proposed for modifying values and the number of non-zero coefficients used to represent the FIR digital pulse shaping filter response. With this algorithm, the FIR filter frequency and phase response can be represented with a minimum number of non-zero coefficients. Therefore, reducing the arithmetic complexity needed to get the filter output. Consequently, the system characteristic i.e. power consumption, area usage, and processing time are also reduced. The proposed algorithm is more powerful when integrated with multiplierless algorithms such as distributed arithmetic (DA) in designing high order digital FIR filters. Here the DA usage eliminates the need for multipliers when implementing the multiply and accumulate unit (MAC) and the proposed algorithm will reduce the number of adders and addition operations needed through the minimization of the non-zero values coefficients to get the filter output.

Keywords: Pulse shaping Filter, Distributed Arithmetic, Optimization algorithm.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3124
3 A Novel Multiple Valued Logic OHRNS Modulo rn Adder Circuit

Authors: Mehdi Hosseinzadeh, Somayyeh Jafarali Jassbi, Keivan Navi

Abstract:

Residue Number System (RNS) is a modular representation and is proved to be an instrumental tool in many digital signal processing (DSP) applications which require high-speed computations. RNS is an integer and non weighted number system; it can support parallel, carry-free, high-speed and low power arithmetic. A very interesting correspondence exists between the concepts of Multiple Valued Logic (MVL) and Residue Number Arithmetic. If the number of levels used to represent MVL signals is chosen to be consistent with the moduli which create the finite rings in the RNS, MVL becomes a very natural representation for the RNS. There are two concerns related to the application of this Number System: reaching the most possible speed and the largest dynamic range. There is a conflict when one wants to resolve both these problem. That is augmenting the dynamic range results in reducing the speed in the same time. For achieving the most performance a method is considere named “One-Hot Residue Number System" in this implementation the propagation is only equal to one transistor delay. The problem with this method is the huge increase in the number of transistors they are increased in order m2 . In real application this is practically impossible. In this paper combining the Multiple Valued Logic and One-Hot Residue Number System we represent a new method to resolve both of these two problems. In this paper we represent a novel design of an OHRNS-based adder circuit. This circuit is useable for Multiple Valued Logic moduli, in comparison to other RNS design; this circuit has considerably improved the number of transistors and power consumption.

Keywords: Computer Arithmetic, Residue Number System, Multiple Valued Logic, One-Hot, VLSI.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1797
2 A Design of Elliptic Curve Cryptography Processor Based on SM2 over GF(p)

Authors: Shiji Hu, Lei Li, Wanting Zhou, Daohong Yang

Abstract:

The data encryption is the foundation of today’s communication. On this basis, to improve the speed of data encryption and decryption is always an important goal for high-speed applications. This paper proposed an elliptic curve crypto processor architecture based on SM2 prime field. Regarding hardware implementation, we optimized the algorithms in different stages of the structure. For modulo operation on finite field, we proposed an optimized improvement of the Karatsuba-Ofman multiplication algorithm and shortened the critical path through the pipeline structure in the algorithm implementation. Based on SM2 recommended prime field, a fast modular reduction algorithm is used to reduce 512-bit data obtained from the multiplication unit. The radix-4 extended Euclidean algorithm was used to realize the conversion between the affine coordinate system and the Jacobi projective coordinate system. In the parallel scheduling point operations on elliptic curves, we proposed a three-level parallel structure of point addition and point double based on the Jacobian projective coordinate system. Combined with the scalar multiplication algorithm, we added mutual pre-operation to the point addition and double point operation to improve the efficiency of the scalar point multiplication. The proposed ECC hardware architecture was verified and implemented on Xilinx Virtex-7 and ZYNQ-7 platforms, and each 256-bit scalar multiplication operation took 0.275ms. The performance for handling scalar multiplication is 32 times that of CPU (dual-core ARM Cortex-A9).

Keywords: Elliptic curve cryptosystems, SM2, modular multiplication, point multiplication.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 139