Search results for: Single-Stage Karatsuba
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 5

Search results for: Single-Stage Karatsuba

5 Efficient Large Numbers Karatsuba-Ofman Multiplier Designs for Embedded Systems

Authors: M.Machhout, M.Zeghid, W.El hadj youssef, B.Bouallegue, A.Baganne, R.Tourki

Abstract:

Long number multiplications (n ≥ 128-bit) are a primitive in most cryptosystems. They can be performed better by using Karatsuba-Ofman technique. This algorithm is easy to parallelize on workstation network and on distributed memory, and it-s known as the practical method of choice. Multiplying long numbers using Karatsuba-Ofman algorithm is fast but is highly recursive. In this paper, we propose different designs of implementing Karatsuba-Ofman multiplier. A mixture of sequential and combinational system design techniques involving pipelining is applied to our proposed designs. Multiplying large numbers can be adapted flexibly to time, area and power criteria. Computationally and occupation constrained in embedded systems such as: smart cards, mobile phones..., multiplication of finite field elements can be achieved more efficiently. The proposed designs are compared to other existing techniques. Mathematical models (Area (n), Delay (n)) of our proposed designs are also elaborated and evaluated on different FPGAs devices.

Keywords: finite field, Karatsuba-Ofman, long numbers, multiplication, mathematical model, recursivity.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2468
4 Very Large Scale Integration Architecture of Finite Impulse Response Filter Implementation Using Retiming Technique

Authors: S. Jalaja, A. M. Vijaya Prakash

Abstract:

Recursive combination of an algorithm based on Karatsuba multiplication is exploited to design a generalized transpose and parallel Finite Impulse Response (FIR) Filter. Mid-range Karatsuba multiplication and Carry Save adder based on Karatsuba multiplication reduce time complexity for higher order multiplication implemented up to n-bit. As a result, we design modified N-tap Transpose and Parallel Symmetric FIR Filter Structure using Karatsuba algorithm. The mathematical formulation of the FFA Filter is derived. The proposed architecture involves significantly less area delay product (APD) then the existing block implementation. By adopting retiming technique, hardware cost is reduced further. The filter architecture is designed by using 90 nm technology library and is implemented by using cadence EDA Tool. The synthesized result shows better performance for different word length and block size. The design achieves switching activity reduction and low power consumption by applying with and without retiming for different combination of the circuit. The proposed structure achieves more than a half of the power reduction by adopting with and without retiming techniques compared to the earlier design structure. As a proof of the concept for block size 16 and filter length 64 for CKA method, it achieves a 51% as well as 70% less power by applying retiming technique, and for CSA method it achieves a 57% as well as 77% less power by applying retiming technique compared to the previously proposed design.

Keywords: Carry save adder Karatsuba multiplication, mid-range Karatsuba multiplication, modified FFA, transposed filter, retiming.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 846
3 Scalable Systolic Multiplier over Binary Extension Fields Based on Two-Level Karatsuba Decomposition

Authors: Chiou-Yng Lee, Wen-Yo Lee, Chieh-Tsai Wu, Cheng-Chen Yang

Abstract:

Shifted polynomial basis (SPB) is a variation of polynomial basis representation. SPB has potential for efficient bit level and digi -level implementations of multiplication over binary extension fields with subquadratic space complexity. For efficient implementation of pairing computation with large finite fields, this paper presents a new SPB multiplication algorithm based on Karatsuba schemes, and used that to derive a novel scalable multiplier architecture. Analytical results show that the proposed multiplier provides a trade-off between space and time complexities. Our proposed multiplier is modular, regular, and suitable for very large scale integration (VLSI) implementations. It involves less area complexity compared to the multipliers based on traditional decomposition methods. It is therefore, more suitable for efficient hardware implementation of pairing based cryptography and elliptic curve cryptography (ECC) in constraint driven applications.

Keywords: Digit-serial systolic multiplier, elliptic curve cryptography (ECC), Karatsuba algorithm (KA), shifted polynomial basis (SPB), pairing computation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2017
2 Analytical Comparison of Conventional Algorithms with Vedic Algorithm for Digital Multiplier

Authors: Akhilesh G. Naik, Dipankar Pal

Abstract:

In today’s scenario, the complexity of digital signal processing (DSP) applications and various microcontroller architectures have been increasing to such an extent that the traditional approaches to multiplier design in most processors are becoming outdated for being comparatively slow. Modern processing applications require suitable pipelined approaches, and therefore, algorithms that are friendlier with pipelined architectures. Traditional algorithms like Wallace Tree, Radix-4 Booth, Radix-8 Booth, Dadda architectures have been proven to be comparatively slow for pipelined architectures. These architectures, therefore, need to be optimized or combined with other architectures amongst them to enhance its performances and to be made suitable for pipelined hardware/architectures. Recently, Vedic algorithm mathematically has proven to be efficient by appearing to be less complex and with fewer steps for its output establishment and have assumed renewed importance. This paper describes and shows how the Vedic algorithm can be better suited for pipelined architectures and also can be combined with traditional architectures and algorithms for enhancing its ability even further. In this paper, we also established that for complex applications on DSP and other microcontroller architectures, using Vedic approach for multiplication proves to be the best available and efficient option.

Keywords: Wallace tree, Radix-4 Booth, Radix-8 Booth, Dadda, Vedic, Single-Stage Karatsuba, Looped Karatsuba.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 767
1 A Design of Elliptic Curve Cryptography Processor Based on SM2 over GF(p)

Authors: Shiji Hu, Lei Li, Wanting Zhou, Daohong Yang

Abstract:

The data encryption is the foundation of today’s communication. On this basis, to improve the speed of data encryption and decryption is always an important goal for high-speed applications. This paper proposed an elliptic curve crypto processor architecture based on SM2 prime field. Regarding hardware implementation, we optimized the algorithms in different stages of the structure. For modulo operation on finite field, we proposed an optimized improvement of the Karatsuba-Ofman multiplication algorithm and shortened the critical path through the pipeline structure in the algorithm implementation. Based on SM2 recommended prime field, a fast modular reduction algorithm is used to reduce 512-bit data obtained from the multiplication unit. The radix-4 extended Euclidean algorithm was used to realize the conversion between the affine coordinate system and the Jacobi projective coordinate system. In the parallel scheduling point operations on elliptic curves, we proposed a three-level parallel structure of point addition and point double based on the Jacobian projective coordinate system. Combined with the scalar multiplication algorithm, we added mutual pre-operation to the point addition and double point operation to improve the efficiency of the scalar point multiplication. The proposed ECC hardware architecture was verified and implemented on Xilinx Virtex-7 and ZYNQ-7 platforms, and each 256-bit scalar multiplication operation took 0.275ms. The performance for handling scalar multiplication is 32 times that of CPU (dual-core ARM Cortex-A9).

Keywords: Elliptic curve cryptosystems, SM2, modular multiplication, point multiplication.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 121