Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 31106
An Efficient Architecture for Interleaved Modular Multiplication

Authors: Ahmad M. Abdel Fattah, Ayman M. Bahaa El-Din, Hossam M.A. Fahmy


Modular multiplication is the basic operation in most public key cryptosystems, such as RSA, DSA, ECC, and DH key exchange. Unfortunately, very large operands (in order of 1024 or 2048 bits) must be used to provide sufficient security strength. The use of such big numbers dramatically slows down the whole cipher system, especially when running on embedded processors. So far, customized hardware accelerators - developed on FPGAs or ASICs - were the best choice for accelerating modular multiplication in embedded environments. On the other hand, many algorithms have been developed to speed up such operations. Examples are the Montgomery modular multiplication and the interleaved modular multiplication algorithms. Combining both customized hardware with an efficient algorithm is expected to provide a much faster cipher system. This paper introduces an enhanced architecture for computing the modular multiplication of two large numbers X and Y modulo a given modulus M. The proposed design is compared with three previous architectures depending on carry save adders and look up tables. Look up tables should be loaded with a set of pre-computed values. Our proposed architecture uses the same carry save addition, but replaces both look up tables and pre-computations with an enhanced version of sign detection techniques. The proposed architecture supports higher frequencies than other architectures. It also has a better overall absolute time for a single operation.

Keywords: FPGA, RSA, modular multiplication, Montgomery multiplication, efficient architecture

Digital Object Identifier (DOI):

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2089


[1] Peter L. Montgomery, "Modular multiplication without trial division," in Mathematics of Computation. April 1985, vol. 44, pp. 519-521, American Mathematical Society.
[2] G. R. Blakley, "A computer algorithm for the product AB modulo M," IEEE Transactions on Computers, pp. 497 - 500, May 1983.
[3] D. Narh Amanor, C. Paar, J. Pelzl, V. Bunimov, and M. Schimmler, "Efficient hardware architectures for modular multiplication on FPGAs," International Conference on Field Programmable Logic and Applications, pp. 539-542, 2005.
[4] V. Bunimov and M. Schimmler, "Area and time efficient modular multiplication of large integers," in IEEE 14th International Conference on Application specific Systems, Architectures and Processors, June 2003.
[5] Q.K. Kop and C.Y. Hung, "Fast algorithm for modular reduction," in IEE Proceedings, Computers and Digital Techniques, July 1998, vol. 145, pp. 265-271.
[6] David Narh Amanor, "Efficient hardware architectures for modular multiplication," M.S. thesis, University of Applied Sciences Offenburg, Germany, February 2005.