Article Citation: Yan-Haw Chen, and Chien-Hsing Huang. (2020).
EFFICIENT OPERATIONS IN LARGE FINITE FIELDS FOR ELLIPTIC CURVE CRYPTOGRAPHIC. International
Journal of Engineering Technologies and Management Research, 7(6), 141-151. https://doi.org/10.29121/ijetmr.v7.i6.2020.712 Published Date: 27 June 2020 Keywords: Elliptic Curve Encryption Finte Field Horner Rule Dynamic Lookup Table Multiplication An efficient method to compute the finite field multiplication for Elliptic Curve point multiplication at high speed encryption of the message is presented. The methods of the operations are based on dynamic lookup table and modified Horner rule method. The modified Horner rule method is not only to finite field operations but also to Elliptic curve scalar multiplication in the encryption and decryption. By comparison with using Russian Peasant method and in the new proposed method, one of the advantages of utilizing the proposed algorithm is that in the Elliptic Curve point addition are reduced by a factor of three in GF (2163). Therefore, using the Algorithm 1 running on Intel CPU, computation cost of the multiplication method is above 70% faster than using standard multiplication by Russian Peasant method. Ultimately, the proposed Algorithm 1 for evaluating multiplication can be made regular, simple and suitable for software implementations.
1. INTRODUCTIONIn theory, a finite field is an algebraic structure with established operations for addition, subtraction, multiplication, and division by satisfying an Abelian group. These operations have following four properties closure, associativity, commutativity, and having an inverse element. Galois Fields GF(2m) have a wide variety of applications utilized in the cryptographic standards of ANSI and error correction code (Chen authors, 2016). The industry uses Elliptic curve groups over the large finite fields of GF(2m) and GF(p), Koblitz EC groups in GF(2m) (Koblitz, 1987) faster than GF(p). The operation modulo is using binary irreducible polynomial in finite field (Scott, 2007; Hasan authors, 1992) that is suitable for resource-constrained systems, such as cellphone, networked wireless sensors, and smart cards. The most efficient and secure cryptographic system in use today is known as elliptic curve cryptography (ECC) and is based on the concept of elliptic curves built over Galois fields (Savas & Ko_c, 2010). NIST recommended curves: Koblitz GF(2m), where m is 163, 233, 283, 409, and 571. Elliptic curves are a type of cubic equation of the form y2=x3+ax+b, where a and b represent coefficients. When elliptic curves are operation in Galois Field, the points on the curve can be form an Abelian group making it operations to addition of two points on the curve, or the point doubling. ECC encryption and decryption data in Galois Field that is popular the form y2+xy=x3+ax2+b, where a and b represent coefficients. The operation on elliptic curves is scalar multiplication (Ansari & Hasan, 2011) which refers to multiplying a Point P by an integer k, resulting in the Point k*P scalar multiplication not only dominates the execution time of ECC algorithms, but it is also essential to the security in systems. In GF(2m), the addition and the subtraction are the same XOR instruction in the processor. However, both multiplication and inversion operations are complicate in cryptographic systems; they are due to finite field of size m (Kobayashi & Takagi, 2008; Jing authors, 2006; Luo authors, 2012; Wang authors, 1983; Mahboob & Ikram, 2005; Brwon, 1971). The multiplication and inverse are required while using the Diffie-Hellman key exchange protocol on an elliptic curve (Dong & Li, 2008), as specified in ANSI X9.62 is required many multiplications and inverses. Therefore, to develop efficient arithmetic operations have high-speed computation that needs for using available technologies. In the past, the multiplication algorithm can be used to look-up tables, which have proposed in (Mahboob & Ikram, 2005). The lookup table method is pre-compute to reduce the number of operations required during the computation through the pre-computation and to reduce the effective computation time for multiplication by suitable width of the registers of the processor to achieve higher computation speed. The new algorithm has two properties: First, it utilizes the dynamic lookup table by precomputing input data and save memory. Second, it uses modified Horner rule for iteration loop and can determine the table entries quickly. The operation of inverse usually utilizes the polynomial modular mathematic called Euclidean algorithm (Brwon, 1971; Dong & Li, 2008), in which is hard to be known the worst cases execution time. The Fermat's Little Theorem also uses to compute inverse because the worst cases execution time can found. The adaptation of Itoh-Tsujii method (Guajardo & Paar, 2002) for standard basis, particularly for Optimal Extension Fields has been effective in achieving fast inversion. However, despite recent improvements, inversion is still the slowest operation in elliptic curve implementations (Agnew authors, 1993; Choi authors, 2002; Kumar, 2006). Finial, this issue is addressed by proposing multiplication methods that the execution time is faster than which measured for others multiplication execution time and it can be speed up establish a shared security over an insecure channel, Elliptic curve Diffie–Hellman (ECDH) (Diffie & Hellman, 1976). 2. PRELIMINARIESLet and be polynomial over the
finite field GF(2m), where ai,
bi {0, 1}. The
multiplication in the finite field is defined as follows:
where the polynomial F(x)
is the irreducible polynomial. There is common method utilizing Horner rule for
computing multiplication, which is rewritten (1) in the following form, where the B(x) polynomial is
represented as binary vector , where .
The ECC with multiplication is irreducible
polynomial .
In (1), the Russian Peasant
method can be written as a
function in C programming as follows:
The
inverse element of A in finite field GF(2m) is
derived by Fermat’s Little Theorem, which is given by
Many
multiplications are required in common method of the inverse, which need 2m-3
multiplications for calculating inverse element. In others word, it need more computation
time for computing inverse but using number theory can reducing
multiplication that is only m-1 multiplications. The inverse operation
also can use the
extended Euclidean algorithm.
Thus, the remainder of division t(x) by mod F(x), where F(x) is irreducible polynomial, that is the multiplicative inverse of A(x) mod F(x). The compute the inverse element over finite field GF(2m) and the extended Euclidean algorithm also can use for computing multiplicative inverse in finite field. However, Euclidean division operation is hard to know the maximum execution time when the number is large. Points P=(x1, y1) and Q=(x2, y2) on the curve, assume. Let be the line that intersects P and Q, the value of the slope needs calculating inverse.
If ≠Q, P
+ Q = R = (x3, y3)
If P = Q,
P + Q = R = 2P = (x3,
y3)
For example, .
Let P and Q be curve point, where . , the point of R is equal P point addition Q point. We know that the slope is . The function of line is . 3. IMPLEMENTATION OF MULTIPLICATION IN LARGE FIELDS GF(2m).Using a two-term polynomial is designed for finite field multiplication, where . Let , where the value of the remainder q is either 0 or 1. Therefore, the computation of can be replaced by the following equation:
A substitution of Equation (7) into Equation (1) yields,
A two-term polynomial can be pre-computed in order to generate a table of values. To see this, let be an array as lookup table, where and . Therefore, a substitution of into Equation (8) yields . Note that denotes the largest integer less than or equal to x. This implies that can be represented by the use of Horner's rule form. It becomes
The product of and is required module F(x) that can be reduce polynomial
degree by a module lookup table M.
The lookup table M is by making the
polynomial F(x). An irreducible
polynomial F(x) is a trinomial xm+xk+1 or a binary
pentanomial xm+xk+xk1+xk2+1 (Hankerson authors, 2004). First, the modulo operation with lookup table
is in term of the irreducible polynomial F(x)=fmxm+fkxk+fk1xk1+fk2xk2+1, where fi is belong to {0, 1}, for computing modulo polynomial.
Let f= fkxk+fk1xk1+fk2xk2+1 are the value of polynomial. Scheme
1, in step 2 needs table M to make L dynamic lookup table. The table M is as shown in Table 1. Note that, the
reduction table method requires that m−k ≥ wb, where wb is the
number of terms polynomial in polynomial A(x). If m-k is less than wb, the reducing polynomial f
to make lookup table cause polynomial B
is dependent. It can’t use lookup table M
for reducing the polynomial degree. The
pre-computing reduction table is shown in Table 1. Table 1: Pre-computing reduction M table
The lookup table M can be
reduced polynomial degree, the leader coefficient of polynomial in degree m-1 multiplied by x is become of degree m that meaning needs the table M to reduce the polynomial degree. The
multiplication by x is represented as
binary form shift left one position (i.e., B=1101,
B << 1=1010). The terms can combinations of two bits for
computing dynamic lookup table in Table 2. Table 2: Two terms dynamic Lookup table
Where bm-1 is leader of the polynomial . Let U= be binary vector , the can be represented as , where is left-shift operation the binary vector (i.e., ). Consequently, the scheme of multiplication for computing the two terms polynomial called Scheme 1, is proposed as follows: Scheme 1:
Multiplication Using Look up Table (LUT) Computation 1) Compute , where m is the element of bit sizes. 2) Set the initial message to be zero. 3) To make reduction table as shown in Table 1 4) Computing table as shown in Table 2. 5) for i = 0 to -1 do 6)
Read bits and from the polynomial A(x) 7) 8) 9) end for i 10) 11) Scheme 1, the lookup table require the memory of the size bits. For example, the GF(2163) requires 4 LUTs. Those tables can be dynamic pre-calculated and can then be stored in memory. The simplest approach is to use a pair of , where and are bit value, for evaluating multiplication in LUTs derived from Scheme 1, which requires only 80 iterations for evaluating a multiplication. If a three-term polynomial is designed for computing multiplication,, It can be divided into unites into A(x). In equation (1) can be replaced by following a three-term polynomial for multiplication. It is called Scheme 2 as follows: Scheme 2: Multiplication
LUT Computation 1) Compute , where m is the element of bit sizes. 2) Set the initial the value of to be zero. 3) To make reduction table as shown in Table 3 4) Computing table as shown in Table 4. 5) for i = 0 to -1 do 6)
Read elements , and 7) 8) 9) end for i 10) 11) Where Table 3: Pre-computing reduction M table
Table 4: Three term dynamic lookup table
Scheme 2, the lookup table require the memory of the size bits. For example, the
GF (2163) requires 8 data in lookup table. Those tables
can be dynamic pre-calculated data to store
in memory. The simplest approach is to use three for , where , , and are bit value, for precomputing data in Table 3 and in Table 4. Scheme 2 used Table 3 and Table 4 operation,
which requires only 53 iterations for evaluating a multiplication, however,
they would be increased time to make dynamic lookup table. Characteristic two
in this is case due to work described in (Gaudry authors, 2000) m
to be prime. According to Scheme 1
(i.e., using Table 1 and Table 2) and Scheme 2 (i.e., using Table 3 and Table 4), If the value of m is odd number, the multiplications can
be designed as Algorithm 1. So, Algorithm 1
can use reducing polynomial f(x)
in Table 5 and wd-term polynomial A(x) in Table 6. These precomputing the table of the size is 2wd, where wd
is the number of terms polynomial (i.e.,
). The flowchart is as shown in Figure 1. Table 5: Pre-computing reduction table
Table 6: -term polynomial for dynamic lookup table
Algorithm 1: Multiplication dynamic Lookup table computation
Figure 1: the
flowchart is large size multiplication operation in GF(2m) 4. RESULTS AND DISCUSSIONSRegarding the efficiency of the
proposed algorithm for encryption and decryption evaluation for EEC, a software
simulation is performed on an Intel® Core™ i5 at 3.07 GHz Windows 7 PC using
C++ program. The element in is
implemented using an unsigned character data type. For any two elements in the
finite field, the multiplication is described in (Genser et al. 2009), and the addition operation is implemented directly by
using the bitwise C++ XOR. The multiplications size of m evaluating time of the 163, 233, 283, 409, and 571 listed in
Table 7. The computational time of these methods over 100,000
times of message computations is listed in Table 8. Table 7: The multiplication of the making table time with different finite filed
Table 8: The multiplication of the looping time with different finite filed
Table 9: The computing time of the multiplication with different finite field
Table 10: The memory usage
Algorithm 1, the multiplication compute with m=163 and wd=5 that can be approximately 70% faster than Russian
Peasant method. For some field, the computations
for the multiplication had been executed over 100,000 data
for testing. The large field m
= 163 and wd=2
the proposed is used to compute multiplication, which requires 4+83=87 Left
Shift (i.e., <<) operand and 1+83*2=164 XOR
(i.e., ^) operand. The inverse computing process is required the number of multiplications and square are the number of m. The multiplication can be applying Algorithm 1 to
evaluate scalar multiplication in Elliptic Curve. Algorithm 1 reducing time performance is better than Russian Peasant method that shown
in Figure 2. Figure 2: The multiplication of m bits operation in finite field 5. CONCLUSIONS AND RECOMMENDATIONSIn this paper, dynamic lookup table method using encryption and decryption in the ECC is presented. In Figure 2, Algorithm 1 is actually faster than Russian Peasant method, where wd > 1 in all instances m bits. The proposed multiplication method also can perform quickly inverse operation. If memory consumption in embedded systems is an acceptable range, the proposed method can be readily adaptable for speeding up and memory used the point multiplication in ECC. Thus, Algorithm 1 can use in different the value of wd to divide the polynomial A(x) for encryption and decryption that can save a lot of the memory in Table 10 when the value of wd is small. It is evident that Algorithm 1 is really suitable for software applications in embedded system. SOURCES OF FUNDINGNone. CONFLICT OF INTERESTNone. ACKNOWLEDGMENTNone. REFERENCES
[1]
Y. H.
Chen, C.
F. Huang, J. Chang, Decoding of
binary quadratic residue codes with hash table, IET Communications, Vol.
10,
No. 1, 2016, 122-130.
[2]
N.
Koblitz, Elliptic curve cryptosystems, Mathematics of Computation, Issue 48, 1987, 203-209.
[3]
M. Scott,
Optimal irreducible polynomials for GF(2m) arithmetic. Cryptology
ePrint Archive, Report 2007/192.
[4]
A.
Hasan, M. Z. Wang, and V. K.
Bhargava, Modular Construction of Low Complexity Parallel Multipliers for a
Class of Finite Fields GF(2m), IEEE Trans. Comput., Vol. 41, No. 8, 1992, 962-971.
[5]
E. Savas
and C. Ko_c, Finite Field Arithmetic for Cryptography, IEEE Journals, Circuits
and Systems Magazine, Vol. 10, No 2, 2010, 40-56.
[6]
B.
Ansari and M. Hasan, High-Performance Architecture of Elliptic Curve Scalar
Multiplication, IEEE Transactions on Computers, Vol. 57, No 11, 2011, 1443-1453.
[7]
Kobayashi
and N. Takagi, A Combined Circuit for Multiplication and Inversion in GF(2m),
IEEE Transactions on Circuits and Systems, Vol 55, No 11,2008, 1144-1148.
[8]
Jing, J.
Chen, Z. Chen and Y. Chen, Low Complexity Architecture for Multiplicative
Inversion in GF(2m), IEEE Asia Pacific Conference on Circuits and Systems
(APCCAS 2006), 2006, 1492-1495.
[9]
Luo, J., Bowers, K. D.,
Oprea, A., and Xu, L. Efficient software implementations of large finite
fields GF (2n) for secure storage applications. ACM Transactions on Storage (TOS), Vol 8, No1, 2012, 2
[10] C. C. Wang, T. K. Truong, H. M. Shao and L.
J. Deutsch, VLSI Architectures for computing Multiplications and Inverses in
GF(2m), TDA Progress Report 42-75, 1983, 52-63.
[11] Mahboob and N. Ikram, Lookup table-based
multiplication technique for GF(2m) with cryptographic significance, IEE Proc.-Commun, Vol. 152, No. 6, 2005, pp. 965-974.
[12] W. S. Brwon, On Euclid’s Algorithm and the
computation of polynomial greatest common divisors, Journal of the Association for Computing Machinery, Vol. 18, 1971, 478-504.
[13] F. Dong and Y. Li, A Novel Shortest Addition
Chains Algorithm Based on Euclid Algorithm, 4th International Conference on
Wireless Communications, Networking and Mobile Computing, 2008, pp. 1-4.
[14] J. Guajardo and C. Paar, Itoh-Tsujii
Inversion in Standard Basis and Its Application in Cryptography and Codes, Designs,
Codes and Cryptography, Vol
25, No 2, 2002, 1573-7586.
[15] G. B. Agnew, R. C. Mullin, and S. A.
Vanstone, An Implementation of Elliptic Curve Cryptosystems over F2155,
IEEE J. Selected Areas in Comm., Vol.
11,
1993, 804-813.
[16] Y. Choi, H. W. Kim, and M. S. Kim,
Implementation of elliptic curve cryptographic coprocessor over GF (2163) for ECC protocols, in Proceedings of the
2002 International Technical Conference on Circuits/Systesm, Computers, and
Communications, 2002, 674-677.
[17] S. Kumar Elliptic Curve Cryptography for
constrained devices, Bochum Research Bibliography, 2006.
[18] W. Diffie and M. E. Hellman, New Directions
in Cryptography, IEEE Trans. Inf. Theory, vol 22, 1976 664-654.
[19] D. Hankerson, A. Menezes, and S. Vanstone, Guide
to Elliptic Curves Cryptogra-phy. Springer, 2004.
[20] P. Gaudry, F. Hess and N.P., Smart. Constructive and destructive facets
of Weil descent on elliptic curves. Preprint, 2000.
This work is licensed under a: Creative Commons Attribution 4.0 International License © IJETMR 2014-2020. All Rights Reserved. |