A New Efficient RNS Reverse Converter for the 4-Moduli Set
\( \{2^n, 2^n+1, 2^n-1, 2^{2n+1}-1\} \)

Edem K. Bankas, Kazeem A. Gbolagade

Abstract—In this paper, we propose a new efficient reverse converter for the 4-moduli set \( \{2^n, 2^n+1, 2^n-1, 2^{2n+1}-1\} \) based on a modified Chinese Remainder Theorem and Mixed Radix Conversion. Additionally, the resulting architecture is further reduced to obtain a reverse converter that utilizes only carry save adders, a multiplexer and carry propagate adders. The proposed converter has an area cost of \((12n+2)\) FAs and \((5n+1)\) HAs with a delay of \((9n+6)\)\(t_{FA}+t_{MUX}\). When compared with state of the art, our proposal demonstrates to be faster, at the expense of slightly more hardware resources. Further, the Area-Time square metric was computed which indicated that our proposed scheme outperforms the state of the art reverse converter.

Keywords—Modified Chinese Remainder Theorem, Mixed Radix Conversion, Reverse Converter, Carry Save Adder, Carry Propagate Adder.

I. INTRODUCTION

RESIDUE Number System (RNS) offers very useful applications in addition, subtraction, and multiplication dominated arithmetic operations, for example, digital filtering, fast Fourier Transform (FFT), image processing etc. This is due to the inherent properties of RNS such as parallelism, modularity, fault tolerance, and carry free operations [1], [16]. Additionally, it has been shown that RNS based processors can even reduce power dissipation in very large scale integrated circuits system design [2]. However, RNS has not found a wide spread usage in general purpose computing due to the following difficult RNS arithmetic operations: magnitude comparison, sign detection, overflow detection, moduli selection, reverse and forward conversions.

Moduli selection is one of the greatest RNS challenges.

This is because, the speed and complexity of the resulting RNS architecture is dependent on the form and the number of moduli set. It has been well established that powers-of-two moduli sets simplify the required arithmetic operations and generate efficient hardware implementations of the RNS architecture [3].

Many moduli sets with their accompanying reverse converters have been proposed. Among these are \( \{2^n, 2^n-1, 2^{n+1}+1 \} \) [5], [6], [7], \( \{2^n, 2^n-1, 2^{n+1}-1 \} \) [8], \( \{2^{2n+1}-1, 2^n, 2^n-1 \} \) [17], and the most popular length-3 moduli set \( \{2^n, 2^n+1, 2^n-1 \} \) [4], [12].

In recent years, due to the increasing demand for some applications that require larger dynamic range and increased parallelism, some length 4 moduli sets such as: \( \{2^n-1, 2^n, 2^n+1, 2^{n+1}-1 \} \) [10], [18], \( \{2^n-1, 2^n, 2^n+1, 2^{n+1}+1 \} \) [9], [18], \( \{2^n-1, 2^n, 2^n+1, 2^{n+1}+1 \} \) [11] and \( \{2^n-1, 2^n, 2^n+1, 2^{2n+1}+1 \} \) [14] generally referred to as 4-moduli supersets of \( \{2^n, 2^n+1, 2^n-1 \} \) have been investigated with their respective reverse conversion algorithms. Some of these conversion algorithms use the Chinese Remainder Theorem (CRT), a combination of CRT and Mixed Radix Conversion (MRC) algorithms [15], a combination of New CRT and MRC [14]. In [14], Molahosseini et al., (2010), introduced the two new 4-moduli sets \( \{2^n-1, 2^n, 2^n+1, 2^{2n+1}+1 \} \) and \( \{2^n-1, 2^n, 2^n+1, 2^{2n+1}+1 \} \). Their proposed reverse converter for \( \{2^n-1, 2^n, 2^n+1, 2^{2n+1}+1 \} \) moduli set was obtained from New CRT II with better performance and hardware requirement when compared with other equivalent \( 5n \) bit dynamic range state of the art reverse converters. The area, speed and the hardware complexity of the resulting reverse converter for the 4-moduli set \( \{2^n, 2^n+1, 1, 2^{n+2}+1 \} \) in [14] can be further reduced and improved.

In this paper, we propose a novel hybridized reverse converter for the moduli set \( \{2^n, 2^n+1, 1, 2^{n+1}+1 \} \). First, the proposed algorithm is based on a hybridization of a modified CRT and MRC methods, resulting in a two level design. In the first level, the equivalent weighted number of the residues \( x_1, x_2, x_3 \) is obtained by using the algorithm presented in [12] for the popular moduli set \( \{2^n, 2^n+1, 2^n-1 \} \). Next, the resulting weighted number equivalent from the first level is hybridized with the fourth residue \( x_4 \) using MRC with respect to the composite moduli set \( \{2^n, 2^n+1, 2^{2n+1}+1 \} \).

The resulting architecture is further simplified in order to obtain a reverse converter that utilizes only Carry Save Adders (CSAs) and Carry Propagate Adders (CPAs). Theoretically speaking, the proposed converter outperforms the state of the art reverse converter.

The rest of the paper is structured as follows. Section II provides a brief background information. In Section III, the proposed algorithm is formulated. Section IV describes the hardware implementation of the proposed algorithm and Section V evaluates the performance of the proposed scheme. Finally, the paper is concluded in Section VI.
II. BACKGROUND

Given a moduli set \( \{m_i\}_{i=1}^k \), the residues \( (x_1, x_2, \ldots, x_k) \) can be converted into the corresponding decimal number \( X \) in the following ways: First, by the use of the well known CRT, which is given as [1]:

\[
X = \sum_{i=1}^{k} m_i^{-1} \left| m_i x_i \right|_M
\]

where \( M = \prod_{i=1}^{k} m_i \), \( M_i = \frac{M}{m_i} \) and \( M_i^{-1} \) is the multiplicative inverse of \( M_i \) with respect to \( m_i \).

Second, the MRC, can also be used. Suppose we have a residue number representation \( (x_1, x_2, \ldots, x_k) \) with respect to the moduli set \( \{m_i\}_{i=1}^k \) and Mixed Radix Digits (MRDs), \( \{a_i\}_{i=1,k} \), the decimal equivalent of the residues can be computed as follows [11]:

\[
X = a_1 + a_2 m_1 + a_3 m_1 m_2 + \ldots + a_n m_1 m_2 \ldots m_{k-1}, \tag{2}
\]

where the MRDs are given as

\[
a_1 = x_1, \\
a_2 = \left( x_2 - a_1 \right) \left| m_1^{-1} \right|_{m_2}, \\
a_3 = \left( (x_3 - a_1) \left| m_1^{-1} \right|_{m_2} - a_2 \right) \left| m_2^{-1} \right|_{m_3} \\
\vdots \\
a_k = \left( ((x_k - a_1) \left| m_1^{-1} \right|_{m_2} - a_2 \right) \left| m_2^{-1} \right|_{m_3} - \ldots - a_{k-1} \right) \left| m_{k-1}^{-1} \right|_{m_k} \tag{3}
\]

Third, the New CRT I proposed in [13] could also be used to convert RNS to its decimal equivalent. Given a 4-moduli set \( \{P_1, P_2, P_3, P_4\} \), the number \( X \) can be converted from its residue representation \( (x_1, x_2, x_3, x_4) \) as follows:

\[
X = x_1 + P_1 k_1 (x_2 - x_1) + k_2 P_2 (x_3 - x_2) + k_3 P_3 P_4 (x_4 - x_3), \tag{4}
\]

where \( k_1 P_1 | P_2 P_3 P_4 = 1 \) \( \tag{5} \)

\( |k_2 P_2| P_3 P_4 = 1 \) \( \tag{6} \)

\( |k_3 P_3 P_4| P_1 P_2 = 1 \) \( \tag{7} \)

Similarly, the New CRT II [14] defined for a 4 moduli set \( \{P_1, P_2, P_3, P_4\} \), the number \( X \) can be computed from its corresponding residues \( (x_1, x_2, x_3, x_4) \) by:

\[
X = Z + P_1 P_2 k_1 (Y - Z) | P_3 P_4, \tag{8}
\]

\[
Z = x_1 + P_1 k_2 (x_2 - x_1) | P_2, \tag{9}
\]

\[
Y = x_3 + P_3 | k_3 (x_4 - x_3) | P_4, \tag{10}
\]

where \( |k_1 P_1 P_2| P_3 P_4 = 1 \) \( \tag{11} \)

|\( k_2 P_2 | P_3 P_4 = 1 \) \( \tag{12} \)

|\( k_3 P_3 P_4| P_1 P_2 = 1 \) \( \tag{13} \)

where \( k_1, k_2, \) and \( k_3 \) are the multiplicative inverses.

III. PROPOSED ALGORITHM

Given \( \{2^n, 2^n + 1, 2^n - 1, 2^{n+1} - 1\} \) as the 4-moduli set with corresponding residues \( (x_1, x_2, x_3, x_4) \), the proposed algorithm which consists of two levels is formulated using the following theorems:

Given the moduli set \( \{m_1, m_2, m_3\} \) with \( m_1 = 2^n, m_2 = 2^n + 1, and m_3 = 2^n - 1 \), the decimal equivalent of the residue numbers \( (x_1, x_2, x_3) \) is computed as

\[
A = m_1 \left| \frac{A}{m_1} \right| + x_1 \tag{14}
\]

where \( \left| \frac{A}{m_1} \right| = \left| u_1 + u_2 + u_3 | m_1 m_2 m_3, u_1 = (\frac{1}{m_1} - m_2 + 1) x_1 \right|, u_2 = (m_1 m_2^{-1} - 1) x_2 \) and \( u_3 = \frac{m_3}{m_2} x_3 \)

\[
\frac{A}{m_1} \] can be represented as:

\[
\alpha = \left| \frac{A}{m_1} \right| = \left| u_1 + u_1 + u_4 + u_2 + u_3 \right|_{2^{n-1}}, \tag{15}
\]

where

\[
u_1 = x_1, n = \frac{x_1}{n} \tag{16}
\]

\[
u_2 = x_2, n = \frac{x_2}{n} \tag{17}
\]

\[
u_3 = x_3, n = \frac{x_3}{n} \tag{18}
\]

\[
u_4 = x_4, n = \frac{x_4}{n} \tag{19}
\]

\[
u_5 = x_5, n = \frac{x_5}{n} \tag{20}
\]

Proof:

This theorem has been proved in [12].

Given \( \{m_1 = 2^n, m_2 = 2^n + 1, m_3 = 2^n - 1, m_4 = 2^{n+1} - 1\} \) as the superset, (21) holds true:

\[
k = \left| (2^n (2^n - 1))^{-1} \right|_{2^{n+1} - 1} = 2^{n+1} - 2^{n+2} - 1 \tag{21}
\]
Proof:
If it can be demonstrated that 
\[
\left(\frac{2^n(2^{2n} - 1) \times (2^{n+1} - 2^{n+2} - 1)}{2^{2n+1} - 2^{n+2} - 1}\right)_{2^{2n+1}} = 1
\]
with respect to \(2^{n+1} + 1\):
\[
\left(\frac{2^n(2^{2n} - 1) \times (2^{n+1} - 2^{n+2} - 1)}{2^{2n+1} - 2^{n+2} - 1}\right)_{2^{2n+1}} = 1
\]
\[
\left(\frac{2^n(2^{2n} - 1) \times (2^{n+1} - 2^{n+2})}{2^{2n+1} - 2^{n+2} - 1}\right)_{2^{2n+1}} = 1
\]
\[
\left(\frac{1}{2} \times 2^n + 1 - \frac{3}{2} \times 2^n + 2^n\right)_{2^{2n+1}} = 1
\]
\[
2^{n+1}(1 - 3) + 2^n + 1 = 1
\]

Given \(\{2^n, 2^n + 1, 2^n - 1, 2^{n+1} - 1\}\) as the superset, the decimal equivalent \(X\) of the RNS number \(\{x_1, x_2, x_3, x_4\}\) can be computed as:

\[
X = A + a_4(2^{3n} - 2^n)
\]

where \(a_4 = [(x_4 - A)k]_{2^{2n+1}}\), \(A\) and \(k\) are given by (14) and (21) respectively.

Proof: Given the moduli set \(\{m_1 = 2^n, m_2 = 2^n + 1, m_3 = 2^n - 1, m_4 = 2^{n+1} - 1\}\) with residues \(\{x_1, x_2, x_3, x_4\}\), using the MRC, its decimal equivalent is computed using (2) for \(\{a_1\}_{i=1}^n\). Now, by considering the moduli set \(\{2^n, 2^n + 1, 2^n - 1\}\) with residues \(\{x_1, x_2, x_3\}\), its decimal equivalent is computed by using (14). Therefore \(A = a_1 + a_2m_1 + a_3m_3m_2\). Next, we consider the composite set \(\{2^n - 2^n, 2^{n+1} - 1\}\) with residues \(\{A, x_4\}\).

For a two moduli set \(\{m_A, m_4\}\), (2) becomes

\[
X = A + a_4m_1m_2m_3 = A + a_4(2^{3n} - 2^n)
\]

In order to reduce the hardware complexity, we use the following properties [11] to simplify (23).

Property 1: The multiplication of a residue number by \(2^k\) in modulo \((2^p - 1)\) is computed by \(k\) bit circular left shifting

Property 2: A negative number in modulo \((2^p - 1)\) is calculated by subtracting the number in question from \((2^p - 1)\). In binary representation, the ones complement of the number gives the result.

Let the residues \(\{x_1, x_2, x_3\}\) have binary representation as follows:

\[
x_1 = \underbrace{x_{1,n-1} x_{1,n-2} \ldots x_{1,1} x_{1,0}}_{n}
\]

\[
x_2 = \underbrace{x_{2,n} x_{2,n-1} \ldots x_{2,1} x_{2,0}}_{n+1}
\]

\[
x_3 = \underbrace{x_{3,n} x_{3,n-1} \ldots x_{3,1} x_{3,0}}_{n}
\]

\[
x_4 = \underbrace{x_{4,2n} x_{4,2n-1} \ldots x_{4,1} x_{4,0}}_{2n+1}
\]

Note that:

\[
a_4 = \left|\left(x_4 - A\right)(2^{2n+1} - 2^{n+2} - 1)\right|_{2^{2n+1}}
\]

\[
a_4 = \left|\left(x_4 - A\right)(2^{2n+1} - 2^{n+2} - 1)\right|_{2^{2n+1}} = 1
\]

\[
= \left|\left(x_4 - A\right)(2^{2n+1} - 2^{n+2} - 1)\right|_{2^{2n+1}} = 1
\]

(29)

(30)

(31)

(32)

(33)

(34)

(35)

(36)

(37)

(38)

IV. HARDWARE REALIZATION

The hardware implementation of the proposed reverse converter for the moduli set \(\{2^n, 2^n + 1, 2^n - 1, 2^{n+1} - 1\}\) is based on (14), (15), (34), and (35). The hardware architecture consists of two levels, the first level is based on a modified CRT and utilizes (14) and (15). Implementation of (15) requires a five operand modulo \(2^{2n} - 1\) adder, where \(2n\) bit numbers \(u_1, u_13, u_3, u_2,\) and \(u_23\) are added with three
levels of $2n$ bit Carry Save Adder (CSA) with End Around Carry (EAC) followed by a $2n$ bit Carry Propagate Adder (CPA) with a carry in of 1. It must be noted that, two operands in this level have $n$ and $n-1$ most significant bit input equal to 1. This will result in the final one’s complement adder always generating an end around carry. This phenomenon demonstrates that, the one’s complement adder can be reduced to a normal CPA with a constant carry-in equal to 1. This therefore makes the delay $t_{CPA}(2n)$. Note also that, the computation of (14) requires no additional hardware since the desired result is obtained by concatenating $n$ bit number $x_1$ with $\alpha$.

In the second level, (34) which consists of three $2n + 1$ bit numbers $a_{41}, a_{42}'$, and $a_{42}''$ are added with two levels of $2n+1$ bit CSA with EAC followed by a $(2n+1)$ bit CPA. The addition of the operands in (34) modulo $(2^{2n+1} - 1)$ can be accelerated with anticipated computation. Thus, we compute $s_4 + c_4$ for both $c_{3n} = 0$ and $c_{3n} = 1$ and the right result is selected with a multiplexer. This process is concluded with the computation of (35) which requires $(5n+1)$ bit subtractor implemented by $(5n+1)$ bit regular CPA with a constant carry in of 1. Since $\alpha$ is a $3n$ bit number, the computation of (35) requires no additional hardware as it is easily obtained by concatenating $\alpha$ with the results of $(5n+1)$ bits of sum of (36) and (37). The proposed hardware structure is depicted by Fig. 1.

![Proposed hardware Structure](image)

**Fig. 1. Proposed hardware Structure**

**V. PERFORMANCE ANALYSIS**

The performance of the proposed reverse converter is evaluated theoretically in terms of area cost and conversion delay. We compare our proposal with equivalent state of the art reverse converters presented in [14] and [15]. The hardware utilization of our proposal and that of state of the art is computed in terms of Full Adders (FAs) and Half Adders (HAs).

The total delay of our reverse converter is the sum of all the delay of the two levels mentioned above. For the first level, the delay is $(2n + 3)t_{FA}$, while it is $(7n + 3)t_{FA} + t_{MUX}$ in the second level. Therefore, the total delay of the proposed converter is $(9n + 6)t_{FA} + t_{MUX}$ which is faster than that presented in both [14] and [15]. It is clear from Table I that our proposal is faster than the state of the art presented in both [14] and [15] at the expense of slightly more area resources. In order to obtain an adequate comparison, the Area-Time square ($\Delta \tau^2$) efficiency metric was used. The $\Delta \tau^2$ metric suggests that our proposed scheme is more efficient than the state of the art.


P.V.A. Mohan. RNS-to-Binary Converter for a New three moduli set \( \{2^{n+1} - 1, 2^n, 2^n - 1\}\). IEEE Transactions on Circuits and Systems-I, Vol. 54, No.9, pp. 775-779, Sep. 2007.


P.V.A. Mohan, A.B. Premkumar. RNS-to-Binary Converters for two Four moduli set \( \{2^n - 1, 2^n, 2^n + 1, 2^{n+1} - 1\}\) and \( \{2^{n-1} - 1, 2^n, 2^n + 1, 2^{n+1} + 1\}\). IEEE Transactions on Circuits and Systems-I, Regular papers. Vol. 54, No.6, June, 2007.


A. S. Molahosseini, K. Navi and C. Dadkhah. Efficient Reverse Converter Designs for the New 4-Moduli sets \( \{2^n - 1, 2^n, 2^n + 1, 2^{n+1} - 1\}\) and \( \{2^{n-1} - 1, 2^n, 2^n + 1\}\). Based on New CRTs. IEEE Transactions on Circuits and Systems-I, Vol. 57, No.4, 823–835. April, 2010.


B. Cao, T. Srikanth, C.H. Chang. Efficient reverse converters for the four-moduli sets \( \{2^n - 1, 2^n, 2^n + 1, 2^{n+1} - 1\}\) and \( \{2^{n-1} - 1, 2^n, 2^n + 1, 2^{n+1} + 1\}\). IEEE Proc. Computers and Digital Tech. vol. 152, no. 5, pp. 687-696. Sept. 2005.

Kazeem Alagbe Gbolagade received the PhD degree in Computer Engineering from the Delft University of Technology in the Netherlands. He obtained his M.Sc and B.Sc in Computer Science from the University of Ibadan and Born respectively in Nigeria. Presently, he is an Associate Professor in Computer Science and Dean of Faculty of Mathematical Sciences, University for Development Studies, Navrongo, Ghana. He has worked as a visiting researcher at the Technical University of Lisbon in Portugal. His research interests include Digital Logic Design, Computer Arithmetic, Residue Number Systems, and VLSI Design.

Edem Kwedzo Bankas received the M.Ed degree in Computer Education and Technology from Ohio University, USA, and the B.Ed degree in Mathematics from the University of Education, Winneba, Ghana. He is currently working towards the Ph.D degree. His research is in the area of Residue Number Systems, Computer Arithmetic, and Digital Logic Design. He is a lecturer at the Department of Computer Science, Faculty of Mathematical Sciences, University for Development Studies, Navrongo, Ghana.