EFFICIENT OPERATIONS IN LARGE FINITE FIELDS FOR ELLIPTIC CURVE CRYPTOGRAPHIC

An efficient method for computing finite field multiplication, used in elliptic curve point multiplication for high-speed message encryption, is presented. The method is based on a dynamic lookup table and a modified Horner's rule. The modified Horner's rule applies not only to finite field operations but also to elliptic curve scalar multiplication in encryption and decryption. Compared with the Russian Peasant method, one advantage of the proposed algorithm is that the number of elliptic curve point additions is reduced by a factor of three in $GF(2^{163})$. Running on an Intel CPU, Algorithm 1 computes multiplications about 70% faster than standard multiplication by the Russian Peasant method. Finally, the proposed Algorithm 1 for evaluating multiplication is regular, simple, and suitable for software implementations.
 


INTRODUCTION
In theory, a finite field is an algebraic structure with well-defined operations for addition, subtraction, multiplication, and division, whose operations satisfy the Abelian group axioms. These operations have the following four properties: closure, associativity, commutativity, and the existence of an inverse element. Galois fields $GF(2^m)$ have a wide variety of applications, including the cryptographic standards of ANSI and error-correcting codes (Chen et al., 2016). Industry uses elliptic curve groups over the large finite fields $GF(2^m)$ and $GF(p)$; Koblitz elliptic curve groups over $GF(2^m)$ (Koblitz, 1987) are faster than those over $GF(p)$. Modular reduction in the finite field uses a binary irreducible polynomial (Scott, 2007; Hasan et al., 1992), which is suitable for resource-constrained systems such as cellphones, networked wireless sensors, and smart cards. Elliptic curve cryptography (ECC), which is based on elliptic curves built over Galois fields, is among the most efficient and secure cryptographic systems in use today (Savas & Koç, 2010). NIST recommends Koblitz curves over $GF(2^m)$ for m = 163, 233, 283, 409, and 571.

Elliptic curves are cubic equations of the form $y^2 = x^3 + ax + b$, where a and b are coefficients. When an elliptic curve is defined over a Galois field, the points on the curve form an Abelian group whose operations are the addition of two points on the curve and point doubling. ECC encryption and decryption over $GF(2^m)$ commonly use curves of the form $y^2 + xy = x^3 + ax^2 + b$, where a and b are coefficients. The main operation on elliptic curves is scalar multiplication (Ansari & Hasan, 2011), which multiplies a point P by an integer k to give the point kP. Scalar multiplication not only dominates the execution time of ECC algorithms but is also essential to the security of the system. In $GF(2^m)$, addition and subtraction are the same operation, a single XOR instruction on the processor.
However, both multiplication and inversion are complicated operations in cryptographic systems because of the field size m (Kobayashi & Takagi, 2008; Jing et al., 2006; Luo et al., 2012; Wang et al., 1983; Mahboob & Ikram, 2005; Brown, 1971). Many multiplications and inversions are required when using the Diffie-Hellman key exchange protocol on an elliptic curve (Dong & Li, 2008), as specified in ANSI X9.62. Efficient arithmetic operations are therefore needed to achieve high-speed computation with available technologies. Multiplication algorithms based on lookup tables have been proposed previously (Mahboob & Ikram, 2005). The lookup table method uses precomputation to reduce the number of operations required during the computation, and it reduces the effective computation time for multiplication by choosing a table width suited to the register width of the processor.

The new algorithm has two properties. First, it uses a dynamic lookup table built by precomputing on the input data, which saves memory. Second, it uses a modified Horner's rule for the iteration loop and can determine the table entries quickly.

Inversion usually uses the polynomial form of the Euclidean algorithm (Brown, 1971; Dong & Li, 2008), for which the worst-case execution time is hard to determine. Fermat's Little Theorem is also used to compute the inverse because its worst-case execution time can be determined. The adaptation of the Itoh-Tsujii method (Guajardo & Paar, 2002) to the standard basis, particularly for Optimal Extension Fields, has been effective in achieving fast inversion. However, despite recent improvements, inversion is still the slowest operation in elliptic curve implementations (Agnew et al., 1993; Choi et al., 2002; Kumar, 2006).
Finally, this paper addresses the issue by proposing a multiplication method whose execution time is shorter than that measured for other multiplication methods, and which can speed up the establishment of a shared secret over an insecure channel with Elliptic Curve Diffie-Hellman (ECDH) (Diffie & Hellman, 1976).

PRELIMINARIES
Multiplication in $GF(2^m)$ is the product of two polynomials reduced modulo F(x),

$$C(x) = A(x)B(x) \bmod F(x) \qquad (1)$$

where the polynomial F(x) is the irreducible polynomial. A common method uses Horner's rule to compute the multiplication: (1) is rewritten so that the coefficients of B(x), represented as the binary vector $(b_{m-1}, \ldots, b_1, b_0)$, are processed one at a time. Multiplication for ECC is reduced modulo this irreducible polynomial. The Russian Peasant method evaluates (1) in this way and can be written as a function in C.

The common method of inversion requires many multiplications: $2^m - 3$ multiplications are needed to calculate the inverse element. In other words, computing the inverse takes considerable time, but number theory can reduce the cost to only m-1 multiplications. The inverse can also be computed with the extended Euclidean algorithm.
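As a minimal sketch of the Russian Peasant method and of inversion by Fermat's Little Theorem, the two functions below assume a field small enough for one element to fit in a `uint64_t` (m ≤ 63); the fields used in this paper would need multi-word storage. The names `gf2m_mul_peasant` and `gf2m_inv_fermat`, and the convention that `poly` holds F(x) including its $x^m$ term, are assumptions made for illustration and are not taken from the paper. The inverse is computed as $a^{2^m-2}$ with m-1 squarings and m-1 multiplications.

```c
#include <stdint.h>

/* Shift-and-add (Russian Peasant) product of a and b in GF(2^m).
 * poly is F(x) with its x^m bit set; requires m <= 63 and a, b < 2^m. */
uint64_t gf2m_mul_peasant(uint64_t a, uint64_t b, uint64_t poly, unsigned m)
{
    uint64_t c = 0;
    while (b != 0) {
        if (b & 1)
            c ^= a;              /* add the current multiple of a */
        b >>= 1;
        a <<= 1;                 /* a = a * x */
        if ((a >> m) & 1)
            a ^= poly;           /* reduce modulo F(x) */
    }
    return c;
}

/* Inverse of a nonzero element a, computed as a^(2^m - 2). */
uint64_t gf2m_inv_fermat(uint64_t a, uint64_t poly, unsigned m)
{
    uint64_t result = 1;
    uint64_t t = a;
    for (unsigned i = 1; i < m; i++) {
        t = gf2m_mul_peasant(t, t, poly, m);            /* t = a^(2^i) */
        result = gf2m_mul_peasant(result, t, poly, m);  /* accumulate */
    }
    return result;
}
```

With the small field $GF(2^8)$ and $F(x) = x^8 + x^4 + x^3 + x + 1$ (`poly = 0x11B`), `gf2m_mul_peasant(0x57, 0x83, 0x11B, 8)` gives `0xC1`. The loop count in both functions depends only on m and on the bit length of b, which is what makes the worst-case time of the Fermat approach easy to bound.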
The extended Euclidean algorithm can also compute the multiplicative inverse of an element of the finite field $GF(2^m)$. However, the maximum execution time of the Euclidean division is hard to determine when the operands are large. Let $P = (x_1, y_1)$ and $Q = (x_2, y_2)$ be points on the curve $y^2 + xy = x^3 + ax^2 + b$, and consider the line that intersects P and Q. Calculating the slope λ of this line requires an inversion.
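To make explicit where the inversion enters, the standard affine formulas for the binary curve $y^2 + xy = x^3 + ax^2 + b$ are written out below. They are the usual textbook formulas rather than anything specific to the proposed method, and the symbols $x_3, y_3$ for the coordinates of the result are introduced here for illustration.

```latex
% Point addition, P = (x_1, y_1), Q = (x_2, y_2), P \neq \pm Q:
\lambda = \frac{y_1 + y_2}{x_1 + x_2}, \qquad
x_3 = \lambda^2 + \lambda + x_1 + x_2 + a, \qquad
y_3 = \lambda (x_1 + x_3) + x_3 + y_1 .

% Point doubling, P = Q, x_1 \neq 0:
\lambda = x_1 + \frac{y_1}{x_1}, \qquad
x_3 = \lambda^2 + \lambda + a, \qquad
y_3 = x_1^2 + (\lambda + 1)\, x_3 .
```

In both cases the division in λ is one field inversion followed by one multiplication, so each point addition or doubling costs one inversion in affine coordinates.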

IMPLEMENTATION OF MULTIPLICATION IN LARGE FIELDS $GF(2^m)$
A two-term polynomial formed from the coefficients of A(x) is used for finite field multiplication, where the remainder q is either 0 or 1. The computation can therefore be rewritten in terms of this two-term polynomial. Substituting Equation (7) into Equation (1) yields a form that can be evaluated by Horner's rule, where $\lfloor x \rfloor$ denotes the largest integer less than or equal to x. Each multiplication by a power of x in this form must be reduced modulo F(x); the polynomial degree is reduced with a modulo lookup table M, which is built from the polynomial F(x).

An irreducible polynomial F(x) is a trinomial $x^m + x^k + 1$ or a binary pentanomial $x^m + x^k + x^{k_1} + x^{k_2} + 1$ (Hankerson et al., 2004). First, the modulo operation with the lookup table is expressed in terms of the irreducible polynomial $F(x) = f_m x^m + f_k x^k + f_{k_1} x^{k_1} + f_{k_2} x^{k_2} + 1$, where $f_i \in \{0, 1\}$. Let $f = f_k x^k + f_{k_1} x^{k_1} + f_{k_2} x^{k_2} + 1$ denote the value of the polynomial without its leading term.

Scheme 1 relies on the lookup tables in Table 1 and Table 2, and Scheme 2 on those in Table 3 and Table 4. In both schemes, step 2 sets the initial value of C to zero. Scheme 2 requires only 53 iterations to evaluate a multiplication; however, it takes additional time to build the dynamic lookup table. In characteristic two, m is chosen to be prime because of the work described in (Gaudry et al., 2000). According to Scheme 1 (i.e., using Table 1 and Table 2) and Scheme 2 (i.e., using Table 3 and Table 4), if m is odd, the multiplication can be designed as Algorithm 1.
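One way to picture the modulo lookup table M is sketched below. This is an illustration under stated assumptions, not the layout of Table 1 to Table 4: elements fit in one `uint64_t` with m + W ≤ 64, the window width `W` is fixed at 4, the table is indexed by the W bits that overflow past degree m-1, and `f_low` is the polynomial f defined above (F(x) without its $x^m$ term). The function names are hypothetical.

```c
#include <stdint.h>

#define W 4   /* assumed window width; the table has 2^W entries */

/* Build M[h] = (h(x) * x^m) mod F(x) for every W-bit overflow pattern h.
 * f_low is F(x) without its x^m term. */
void build_reduction_table(uint64_t M[1 << W], uint64_t f_low, unsigned m)
{
    const uint64_t mask = ((uint64_t)1 << m) - 1;
    M[0] = 0;
    for (unsigned h = 1; h < (1u << W); h++) {
        uint64_t t = M[h >> 1] << 1;      /* x * M[h / 2] */
        if (t >> m)
            t = (t & mask) ^ f_low;       /* replace x^m by f */
        if (h & 1)
            t ^= f_low;                   /* low bit of h contributes x^m */
        M[h] = t;
    }
}

/* Compute a * x^W mod F(x) with a single table lookup. */
uint64_t shift_reduce(uint64_t a, const uint64_t M[1 << W], unsigned m)
{
    const uint64_t mask = ((uint64_t)1 << m) - 1;
    uint64_t t = a << W;
    return (t & mask) ^ M[t >> m];
}
```

The table depends only on F(x), so it can be built once per field. Each Horner step then replaces W single-bit shift-and-reduce operations with one shift, one lookup, and one XOR.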

Algorithm 1 can therefore use the reducing polynomial f(x) in Table 5 and the wd-term polynomial A(x) in Table 6. The precomputed table has $2^{wd}$ entries, where wd is the number of terms of the polynomial. The flowchart is shown in Figure 1.
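The general idea of combining a dynamic lookup table with Horner's rule can be sketched as follows. This is one plausible reading and not Algorithm 1 itself: it assumes single-word elements (m ≤ 63), a fixed window `WD` of 4 bits, a table `T` of the $2^{wd}$ products of every wd-term polynomial with the operand b, and bit-by-bit reduction for simplicity. The names `gf2m_mul_window` and `mul_x` are hypothetical.

```c
#include <stdint.h>

#define WD 4   /* assumed number of terms per window (wd) */

/* a * x mod F(x); poly is F(x) with its x^m bit set. */
static uint64_t mul_x(uint64_t a, uint64_t poly, unsigned m)
{
    a <<= 1;
    if ((a >> m) & 1)
        a ^= poly;
    return a;
}

/* Windowed product of a and b in GF(2^m), m <= 63. */
uint64_t gf2m_mul_window(uint64_t a, uint64_t b, uint64_t poly, unsigned m)
{
    uint64_t T[1 << WD];
    uint64_t c = 0;                        /* accumulator starts at zero */
    unsigned digits = (m + WD - 1) / WD;   /* number of wd-bit windows in a */

    /* Dynamic table: T[u] = u(x) * b(x) mod F(x), rebuilt for each b. */
    T[0] = 0;
    T[1] = b;
    for (unsigned u = 2; u < (1u << WD); u++)
        T[u] = (u & 1) ? (T[u - 1] ^ b) : mul_x(T[u >> 1], poly, m);

    /* Horner's rule over the windows of a, most significant first. */
    for (unsigned i = digits; i-- > 0; ) {
        for (unsigned s = 0; s < WD; s++)
            c = mul_x(c, poly, m);         /* c = c * x^WD mod F(x) */
        c ^= T[(a >> (i * WD)) & ((1u << WD) - 1)];
    }
    return c;
}
```

The main loop runs once per window, so a larger wd means fewer iterations at the cost of a table with $2^{wd}$ entries that must be rebuilt whenever b changes. This is the trade-off between table-building time and multiplication time discussed in this section.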

RESULTS AND DISCUSSIONS
To evaluate the efficiency of the proposed algorithm for ECC encryption and decryption, a software simulation was performed on an Intel® Core™ i5 PC at 3.07 GHz running Windows 7, using a C++ program. Each element of $GF(2^m)$ is stored using the unsigned character data type. For any two elements in the finite field, multiplication is performed as described in (Genser et al. 2009), and addition is implemented directly with the bitwise C++ XOR operator. The time to build the multiplication tables for m = 163, 233, 283, 409, and 571 is listed in Table 7. The computation time of these methods over 100,000 message computations is listed in Table 8.

Table 7: Table-building time for multiplication over different finite fields (columns: m, wd-term).

With Algorithm 1, multiplication with m = 163 and wd = 5 is approximately 70% faster than the Russian Peasant method. For each field, the multiplication was executed on 100,000 data items for testing. For the large field with m = 163 and wd = 2, the proposed method requires 4+83=87 left shift (i.e., <<) operations and 1+83*2=167 XOR (i.e., ^) operations to compute a multiplication. The number of multiplications and squarings required to compute the inverse is on the order of m. Algorithm 1 can also be applied to evaluate scalar multiplication on an elliptic curve. As shown in Figure 2, Algorithm 1 takes less time than the Russian Peasant method.
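The element representation and XOR addition described above might look like the following sketch. The type name `gf163_t`, the function name `gf163_add`, and the byte order (byte 0 holds the lowest-degree coefficients) are assumptions for illustration; only the use of unsigned characters and bitwise XOR comes from the text.

```c
#include <stddef.h>

#define M_BITS  163
#define M_BYTES ((M_BITS + 7) / 8)   /* 21 bytes hold 163 coefficients */

/* One element of GF(2^163), one coefficient per bit. */
typedef struct {
    unsigned char v[M_BYTES];
} gf163_t;

/* c = a + b in GF(2^163); subtraction is the same operation. */
void gf163_add(gf163_t *c, const gf163_t *a, const gf163_t *b)
{
    for (size_t i = 0; i < M_BYTES; i++)
        c->v[i] = a->v[i] ^ b->v[i];
}
```

Addition needs no reduction because XOR never raises the degree, which is why the cost of ECC arithmetic is dominated by multiplication and inversion.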

CONCLUSIONS AND RECOMMENDATIONS
In this paper, a dynamic lookup table method for encryption and decryption in ECC is presented. As Figure 2 shows, Algorithm 1 is faster than the Russian Peasant method whenever wd > 1, for every field size m tested. The proposed multiplication method also speeds up the inverse operation. If its memory consumption is within an acceptable range for the embedded system, the proposed method can readily be adapted to speed up point multiplication in ECC. Algorithm 1 can use different values of wd to divide the polynomial A(x) for encryption and decryption, and a small value of wd saves considerable memory, as Table 10 shows. Algorithm 1 is therefore well suited to software applications in embedded systems.

SOURCES OF FUNDING
None.