A Low-Latency Quadratic Extended Domain Modular Multiplier for Bilinear Pairing Based on NLP Multiplication
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 33297
A Low-Latency Quadratic Extended Domain Modular Multiplier for Bilinear Pairing Based on NLP Multiplication

Authors: Yulong Jia, Xiang Zhang, Ziyuan Wu, Shiji Hu

Abstract:

The calculation of bilinear pairing is the core of the SM9 algorithm, which relies on the underlying prime domain algorithm and the quadratic extension domain algorithm. Among the field algorithms, modular multiplication operation is the most time-consuming part. Therefore, the underlying modular multiplication algorithm is optimized to maximize the operation speed of bilinear pairings. This paper uses a modular multiplication method based on non-least positive (NLP) combined with Karatsuba and schoolbook multiplication to improve the Montgomery algorithm. At the same time, according to the characteristics of multiplication operation in quadratic extension domain, a quadratic extension domain FP2-NLP modular multiplication algorithm for bilinear pairings is proposed, which effectively reduces the operation time of modular multiplication in quadratic extension domain. The sub-expanded domain Fp2-NLP modular multiplication algorithm effectively reduces the operation time of modular multiplication under the second-expanded domain; The multiplication unit in the quadratic extension domain is implemented using SMIC55nm process, and two different implementation architectures are designed to cope with different application scenarios. Compared with the existing related literature, The output latency of this design can reach a minimum of 15 cycles. The shortest time for calculating the (AB+CD)r−1 mod M form is 37.5ns, and the comprehensive area-time product (AT) is 11400, The final R-ate pairing algorithm hardware accelerator consumes 2670k equivalent logic gates and 1.8ms computing time in 55nm process.

Keywords: sm9, hardware, NLP, Montgomery.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 19

References:


[1] A. Shamir, “Identity-based cryptosystems and signature schemes,” in Advances in Cryptology: Proceedings of CRYPTO 84 4. Springer, 1985, pp. 47–53.
[2] F. Hess, N. P. Smart, and F. Vercauteren, “The eta pairing revisited,” IEEE transactions on information theory, vol. 52, no. 10, pp. 4595–4602, 2006.
[3] E. Lee, H.-S. Lee, and C.-M. Park, “Efficient and generalized pairing computation on abelian varieties,” IEEE Transactions on Information Theory, vol. 55, no. 4, pp. 1793–1803, 2009.
[4] F. Vercauteren, “Optimal pairings,” IEEE transactions on information theory, vol. 56, no. 1, pp. 455–461, 2009.
[5] D. F. Aranha, K. Karabina, P. Longa, C. H. Gebotys, and J. L´opez, “Faster explicit formulas for computing pairings over ordinary curves,” in Advances in Cryptology–EUROCRYPT 2011: 30th Annual International Conference on the Theory and Applications of Cryptographic Techniques, Tallinn, Estonia, May 15-19, 2011. Proceedings 30. Springer, 2011, pp. 48–68.
[6] R. C. Cheung, S. Duquesne, J. Fan, N. Guillermin, I. Verbauwhede, and G. X. Yao, “Fpga implementation of pairings using residue number system and lazy reduction,” in International Workshop on Cryptographic Hardware and Embedded Systems. Springer, 2011, pp. 421–441.
[7] P. L. Montgomery, “Modular multiplication without trial division,” Mathematics of computation, vol. 44, no. 170, pp. 519–521, 1985.
[8] C. D. Walter, “Montgomery exponentiation needs no final subtractions,” Electronics letters, vol. 35, no. 21, pp. 1831–1832, 1999.
[9] A. F. Tenca and C. K. Koc¸, “A scalable architecture for montgomery nultiplication,” in Cryptographic Hardware and Embedded Systems: First InternationalWorkshop, CHES’99 Worcester, MA, USA, August 12–13, 1999 Proceedings 1. Springer, 1999, pp. 94–108.
[10] A. A. Karatsuba and Y. P. Ofman, “Multiplication of many-digital numbers by automatic computers,” in Doklady Akademii Nauk, vol. 145, no. 2. Russian Academy of Sciences, 1962, pp. 293–294.
[11] A. Rezai and P. Keshavarzi, “High-throughput modular multiplication and exponentiation algorithms using multibit-scan–multibit-shift technique,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 23, no. 9, pp. 1710–1719, 2014.
[12] S.-R. Kuang, K.-Y. Wu, and R.-Y. Lu, “Low-cost high-performance vlsi architecture for montgomery modular multiplication,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 24, no. 2, pp. 434–443, 2015.
[13] S. S. Erdem, T. Yanık, and A. C¸ elebi, “A general digit-serial architecture for montgomery modular multiplication,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 25, no. 5, pp. 1658–1668, 2017.
[14] A. Rezai and P. Keshavarzi, “High-performance scalable architecture for modular multiplication using a new digit-serial computation,” Microelectronics journal, vol. 55, pp. 169–178, 2016.
[15] Y. Li, J. Han, S.Wang, D. Fang, and X. Zeng, “An 800mhz cryptographic pairing processor in 65nm cmos,” in 2012 IEEE Asian Solid State Circuits Conference (A-SSCC). IEEE, 2012, pp. 217–220.
[16] J. Han, Y. Li, Z. Yu, and X. Zeng, “A 65 nm cryptographic processor for high speed pairing computation,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 23, no. 4, pp. 692–701, 2014.
[17] Y. Pang, Y. Zhang, J. Han, and X. Zeng, “F p 2 arithmetic acceleration based on modified barrett modular multiplication algorithm,” in 2017 IEEE 12th International Conference on ASIC (ASICON). IEEE, 2017, pp. 561–564.
[18] A. Miyamoto, N. Homma, T. Aoki, and A. Satoh, “Systematic design of rsa processors based on high-radix montgomery multipliers,” IEEE Transactions on very large scale integration (VLSI) Systems, vol. 19, no. 7, pp. 1136–1146, 2010.
[19] C. K. Koc, T. Acar, and B. S. Kaliski, “Analyzing and comparing montgomery multiplication algorithms,” IEEE micro, vol. 16, no. 3, pp. 26–33, 1996.
[20] A. T. Wang, B. W. Guo, and C. J. Wei, “Highly-parallel hardware implementation of optimal ate pairing over barreto-naehrig curves,” Integration, vol. 64, pp. 13–21, 2019.
[21] Z. Hao, W. Guo, J. Wei, and D. Sun, “Dual processing engine architecture to speed up optimal ate pairing on fpga platform,” in 2016 IEEE Trustcom/BigDataSE/ISPA. IEEE, 2016, pp. 584–589.
[22] D. D. Chen, G. X. Yao, R. C. Cheung, D. Pao, and C. K. Koc¸, “Parameter space for the architecture of fft-based montgomery modular multiplication,” IEEE Transactions on Computers, vol. 65, no. 1, pp. 147–160, 2015.
[23] J. Ding and S. Li, “A low-latency and low-cost montgomery modular multiplier based on nlp multiplication,” IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 67, no. 7, pp. 1319–1323, 2019.
[24] X. Dong, M. Gao, X. Ma, C. Xiao, and L. Zhang, “An implementation of r-ate pairing based on fpga,” in 2024 3rd International Conference on Big Data, Information and Computer Network (BDICN), 2024, pp. 167–173.
[25] S. Ghosh, D. Mukhopadhyay, and D. Roychowdhury, “High speed flexible pairing cryptoprocessor on fpga platform,” in Pairing-Based Cryptography - Pairing 2010, M. Joye, A. Miyaji, and A. Otsuka, Eds. Berlin, Heidelberg: Springer Berlin Heidelberg, 2010, pp. 450–466.
[26] J. Fan, F. Vercauteren, and I. Verbauwhede, “Faster Fp-arithmetic for cryptographic pairings on barreto-naehrig curves,” in Cryptographic Hardware and Embedded Systems - CHES 2009, C. Clavier and K. Gaj, Eds. Berlin, Heidelberg: Springer Berlin Heidelberg, 2009, pp. 240–253.
[27] D. Kammler, D. Zhang, P. Schwabe, H. Scharwaechter, M. Langenberg, D. Auras, G. Ascheid, and R. Mathar, “Designing an asip for cryptographic pairings over barreto-naehrig curves,” in Cryptographic Hardware and Embedded Systems - CHES 2009, C. Clavier and K. Gaj, Eds. Berlin, Heidelberg: Springer Berlin Heidelberg, 2009, pp. 254–271.