MODIFIED GENETIC ALGORITHM BASED SOFTWARE RELIABILITY USING SPRT: RAYLEIGH MODEL

Abstract: In classical hypothesis testing, a large volume of data must be collected before conclusions are drawn, which can take considerable time. Sequential analysis, by contrast, can be used to decide quickly whether the developed software is reliable or unreliable. The procedure adopted for this is the Sequential Probability Ratio Test (SPRT), which is designed for continuous monitoring. The likelihood-based SPRT proposed by Wald is very general and can be used with many different probability distributions. In the present paper we apply the SPRT to 6 time-domain data sets using the Rayleigh model and analyze the results. The parameters are estimated using a Modified Genetic Algorithm.


Introduction
Sequential analysis is a method of statistical inference whose main feature is that the number of observations required by the procedure is not determined in advance. The decision to stop observing depends, at each stage, on the results of the samples already taken. The Sequential Probability Ratio Test (SPRT) is usually applied in situations that require a decision between two simple hypotheses or a single decision point. Wald's (1947) SPRT procedure has been used to classify the software under test into one of two categories (e.g., reliable/unreliable, pass/fail, certified/noncertified) (Reckase, 1983). Wald's procedure is particularly relevant when the data are collected sequentially. Classical hypothesis testing differs from sequential analysis: the number of cases tested or collected is fixed at the beginning of the experiment, and the analysis is made and conclusions are drawn only after the complete data have been collected.
In the analysis of software failure data, one deals with either times between failures (TBFs) or the failure count in a given time interval. If it is further assumed that the average number of recorded failures in a given time interval is directly proportional to the length of the interval, and that the random number of failure occurrences in the interval is explained by a Poisson process, then the probability equation of the failure process is that of a homogeneous Poisson process:

$$P[N(t) = n] = \frac{e^{-\lambda t} (\lambda t)^{n}}{n!}, \quad n = 0, 1, 2, \ldots$$

where $N(t)$ is the number of failures up to time $t$ and $\lambda$ is the failure rate.
Stieber (1997) observes that the application of software reliability growth models (SRGMs) may be difficult, and reliability predictions can be misleading, if classical testing strategies are used. However, he observes that statistical methods can be successfully applied to the failure data. He demonstrated this by applying Wald's well-known sequential probability ratio test to software failure data in order to detect unreliable software components and to compare the reliability of different software versions. In this paper the popular Rayleigh SRGM is considered, and the principle of Stieber is adopted to detect unreliable software and thereby accept or reject the developed software. The theory proposed by Stieber is presented in Section 2 for ready reference. The extension of this theory to the considered SRGM is presented in Section 3. The Modified Genetic Algorithm based parameter estimation method is presented in Section 4. The application of the decision rule to detect unreliable software with reference to the SRGM is given in Section 5.

Sequential Test for a Poisson Process
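A. Wald developed the SPRT at Columbia University in 1943. A big advantage of sequential tests is that they require fewer observations (less time) on average than fixed-sample-size tests. SPRTs are widely used for statistical quality control in manufacturing processes.

Let $N(t)$ be the number of failures up to time $t$ of a homogeneous Poisson process with failure rate $\lambda$. We cannot expect to estimate $\lambda$ precisely. But we want to reject the system with a high probability if the data suggest that the failure rate is larger than $\lambda_1$, and accept it with a high probability if it is smaller than $\lambda_0$, where $0 < \lambda_0 < \lambda_1$. As always with statistical tests, there is some risk of getting the wrong answer. So we have to specify two (small) numbers $\alpha$ and $\beta$, where $\alpha$ is the probability of falsely rejecting the system, that is, rejecting the system even if $\lambda \le \lambda_0$. This is the "producer's" risk. $\beta$ is the probability of falsely accepting the system, that is, accepting the system even if $\lambda \ge \lambda_1$. This is the "consumer's" risk. Wald's classical SPRT is very sensitive to the choice of relative risk required in the specification of the alternative hypothesis.

With the classical SPRT, tests are performed continuously at every time point. Let $P_1$ and $P_0$ be the probabilities of observing $N(t)$ failures in $(0, t)$ under the failure rates $\lambda_1$ and $\lambda_0$, respectively:

$$P_i = \frac{e^{-\lambda_i t} (\lambda_i t)^{N(t)}}{N(t)!}, \quad i = 0, 1, \qquad \frac{P_1}{P_0} = e^{(\lambda_0 - \lambda_1) t} \left( \frac{\lambda_1}{\lambda_0} \right)^{N(t)}$$

The decision depends on whether the ratio $P_1/P_0$ is greater than or equal to a constant, say $A$, less than or equal to a constant, say $B$, or in between the constants $A$ and $B$. That is, we decide that the given software product is unreliable, reliable, or continue (Satyaprasad, 2007) the test process with one more observation in the failure data, according to

$$\frac{P_1}{P_0} \ge A \ \text{(unreliable)}, \qquad \frac{P_1}{P_0} \le B \ \text{(reliable)}, \qquad B < \frac{P_1}{P_0} < A \ \text{(continue)}$$

The approximate values of the constants $A$ and $B$ are taken as

$$A \cong \frac{1 - \beta}{\alpha}, \qquad B \cong \frac{\beta}{1 - \alpha}$$

The test continues with one more observation on $(t, N(t))$ as long as $B < P_1/P_0 < A$.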
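The decision rule for the homogeneous Poisson process can be sketched in a few lines of Python. This is a minimal illustration, assuming the standard likelihood ratio $P_1/P_0 = e^{(\lambda_0 - \lambda_1)t} (\lambda_1/\lambda_0)^{N(t)}$ and Wald's approximations $A \cong (1-\beta)/\alpha$ and $B \cong \beta/(1-\alpha)$. The function name `sprt_poisson` and the numeric values in the usage note are hypothetical.

```python
import math


def sprt_poisson(t, n, lam0, lam1, alpha, beta):
    """Classify one observation (t, N(t) = n) of a homogeneous Poisson process.

    Returns "unreliable", "reliable", or "continue".
    """
    if not (0 < lam0 < lam1):
        raise ValueError("expected 0 < lam0 < lam1")
    # Work with logarithms to avoid overflow for large n:
    # log(P1/P0) = (lam0 - lam1) * t + n * log(lam1 / lam0)
    log_ratio = (lam0 - lam1) * t + n * math.log(lam1 / lam0)
    log_a = math.log((1 - beta) / alpha)  # Wald's approximation of A
    log_b = math.log(beta / (1 - alpha))  # Wald's approximation of B
    if log_ratio >= log_a:
        return "unreliable"
    if log_ratio <= log_b:
        return "reliable"
    return "continue"
```

For example, with the illustrative values `lam0=0.01`, `lam1=0.02` and `alpha=beta=0.05`, observing no failures up to `t=400` leads to acceptance, while ten failures by `t=100` lead to rejection.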

Sequential Test for Software Reliability Growth Models
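In Section 2, for the Poisson process it is known that the expected value of $N(t)$ is $\lambda t$, the average number of failures experienced in time $t$. This is the mean value function of the Poisson process. For a Poisson process with a general (not necessarily linear) mean value function $m(t)$, the probability equation is

$$P[N(t) = y] = \frac{[m(t)]^{y}}{y!} e^{-m(t)}, \quad y = 0, 1, 2, \ldots$$

Let $m_0(t)$ and $m_1(t)$, with $m_0(t) < m_1(t)$, be the values of the mean value function at the parameter specifications that indicate reliable and unreliable software, respectively, and let $P_0$ and $P_1$ be the corresponding probabilities, so that

$$\frac{P_1}{P_0} = e^{-[m_1(t) - m_0(t)]} \left[ \frac{m_1(t)}{m_0(t)} \right]^{N(t)}$$

The system is accepted as reliable if $P_1/P_0 \le B$ and rejected as unreliable if $P_1/P_0 \ge A$. Continue the test procedure as long as

$$\frac{\log\frac{\beta}{1-\alpha} + m_1(t) - m_0(t)}{\log m_1(t) - \log m_0(t)} < N(t) < \frac{\log\frac{1-\beta}{\alpha} + m_1(t) - m_0(t)}{\log m_1(t) - \log m_0(t)}$$

Substituting the mean value function $m(t)$ of the Rayleigh model for $m_0(t)$ and $m_1(t)$ gives the respective decision rules.

Acceptance region:

$$N(t) \le \frac{\log\frac{\beta}{1-\alpha} + m_1(t) - m_0(t)}{\log m_1(t) - \log m_0(t)}$$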
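The bounds on $N(t)$ for a Poisson process with a general mean value function can be computed as sketched below. The sketch assumes the likelihood ratio $P_1/P_0 = e^{-[m_1(t) - m_0(t)]} [m_1(t)/m_0(t)]^{N(t)}$ with $m_0(t) < m_1(t)$ and Wald's approximations for $A$ and $B$. It takes the values $m_0(t)$ and $m_1(t)$ as inputs rather than assuming a particular form of the Rayleigh mean value function. The function names are hypothetical.

```python
import math


def nhpp_sprt_bounds(m0_t, m1_t, alpha, beta):
    """Return (accept_bound, reject_bound) on N(t), given m0(t) < m1(t)."""
    if not (0 < m0_t < m1_t):
        raise ValueError("expected 0 < m0(t) < m1(t)")
    denom = math.log(m1_t) - math.log(m0_t)  # positive because m1 > m0
    diff = m1_t - m0_t
    # P1/P0 <= B  <=>  N(t) <= (log B + m1 - m0) / (log m1 - log m0)
    accept_bound = (math.log(beta / (1 - alpha)) + diff) / denom
    # P1/P0 >= A  <=>  N(t) >= (log A + m1 - m0) / (log m1 - log m0)
    reject_bound = (math.log((1 - beta) / alpha) + diff) / denom
    return accept_bound, reject_bound


def nhpp_sprt_decision(n, m0_t, m1_t, alpha, beta):
    """Classify the failure count n observed at time t."""
    accept_bound, reject_bound = nhpp_sprt_bounds(m0_t, m1_t, alpha, beta)
    if n <= accept_bound:
        return "reliable"
    if n >= reject_bound:
        return "unreliable"
    return "continue"
```

In use, `m0_t` and `m1_t` would be obtained by evaluating the fitted mean value function at each failure time under the two parameter specifications, and the decision would be re-evaluated after every new observation.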

Modified Genetic Algorithm
Genetic algorithms (GAs) are widely used to solve optimization problems. A GA is easy to implement, can search a large space, and converges rapidly on good-quality solutions. It imposes no restrictions on the continuity, the existence of derivatives, or the unimodality of the evaluation function. A traditional GA searches in several steps:
• Chromosome representation: The GA generates an initial population of parametric solutions represented as chromosomes. Each chromosome is encoded as a string of bits. Since the parameters of SRGMs are usually real numbers, the IEEE floating-point standard is used to encode chromosomes.
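As an illustration of the chromosome representation, the following sketch encodes a real-valued parameter as a bit string and decodes it again. It assumes the 32-bit IEEE 754 single-precision layout (1 sign bit, 8 exponent bits, 23 fraction bits). The text names only the IEEE floating-point standard, so the precision is an assumption, as are the function names `encode` and `decode`.

```python
import struct


def encode(value):
    """Encode a real number as a 32-character IEEE 754 single-precision bit string."""
    # Pack as a big-endian float32, then reinterpret the same 4 bytes as an unsigned int.
    (as_int,) = struct.unpack(">I", struct.pack(">f", value))
    return format(as_int, "032b")


def decode(bits):
    """Decode a 32-character bit string back into a Python float."""
    (value,) = struct.unpack(">f", struct.pack(">I", int(bits, 2)))
    return value
```

With this layout, index 0 of the string is the sign bit, indices 1 to 8 hold the exponent, and indices 9 to 31 hold the fraction, which is what makes position-dependent mutation possible.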

Chromosome Representation and Weighted Bit Mutation
• Fitness function: The fitness is based on least squares estimation (LSE), where the mean squared error (MSE) measures the difference between the actual values and the estimates.
• Selection scheme: This scheme selects candidate chromosomes from the current population based on their fitness values. Our goal is to maximize the fitness function in order to find the best parameters. With these fitness values, we can further adopt roulette wheel selection and uniform crossover to choose candidate chromosomes. A rebuilding mechanism is also proposed: in each generation, the best chromosome is kept at the end of the population so that it is not lost during selection. This mechanism does not violate the GA's original purpose.
• Crossover operator: Two chromosomes are chosen from the population and exchanged in part with each other in order to improve their fitness values. Uniform crossover is one of the simplest forms (Goldberg, 1989). The crossover may happen at different bits with a probability called the crossover rate, P. In the GA literature this rate typically ranges from 0.5 to 0.8 (Jiang, 2006). Uniform crossover is adopted in our experiments.
• Mutation operator: In the IEEE floating-point format, some bits are less effective targets for mutation. Mutating the sign bit is useless because the estimated parameters are positive real numbers. Similarly, mutating a very high exponent bit makes the value $2^{\pm 128}$ times the original, while mutating a very low fraction bit changes it only slightly. Such mutations are either too severe or negligible. Based on a sensitivity analysis of different bit mutations, a weighted bit mutation is provided.
• Stopping criteria: The search iteratively evolves parametric solutions until the maximum number of generations (10000 trials) is reached or the best fitness value has not changed in the past 10000 trials.
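The idea of a weighted bit mutation can be sketched as follows. The weights below are hypothetical and serve only to illustrate the principle described above: the sign bit and the highest exponent bit are never mutated, and the lowest fraction bits are chosen rarely. The actual weights would come from the sensitivity analysis, which is not reproduced here. The sketch assumes a 32-bit single-precision bit string.

```python
import random

# Hypothetical per-bit weights for a 32-bit IEEE 754 single-precision string:
# index 0 = sign bit, 1-8 = exponent bits, 9-31 = fraction bits.
BIT_WEIGHTS = (
    [0.0]           # sign bit: parameters are positive, so never flip it
    + [0.0]         # highest exponent bit: flipping it is too severe
    + [1.0] * 7     # remaining exponent bits: coarse changes
    + [2.0] * 12    # high fraction bits: moderate changes, favored
    + [0.25] * 11   # low fraction bits: nearly negligible changes
)


def weighted_bit_mutation(bits, rng, weights=BIT_WEIGHTS):
    """Flip exactly one bit, chosen with probability proportional to its weight."""
    if len(bits) != len(weights):
        raise ValueError("bit string and weights must have the same length")
    position = rng.choices(range(len(bits)), weights=weights, k=1)[0]
    flipped = "1" if bits[position] == "0" else "0"
    return bits[:position] + flipped + bits[position + 1:]
```

Passing a seeded `random.Random` instance as `rng` keeps runs reproducible, which is convenient when comparing different weight settings.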

A. Algorithm for parameter estimation
In this section, we show how to modify the traditional GA to estimate the parameters of SRGMs. The detailed algorithm of the MGA is shown below. All the proposed mechanisms of the MGA are implemented in the Java programming language.
1) Initialize a population of chromosomes randomly
2) FOR (Iteration i=1; i<=Maximum generation && termination condition=FALSE; i=i+1)
   a) Calculate fitness for all individual chromosomes
   b) Reproduce offspring by roulette selection
   c) Choose two chromosomes from the population in order and randomize a probability p
   d) IF p < Crossover rate THEN
      i. Generate two offspring by recombining the two chromosomes
      ENDIF
   e) Choose a chromosome from the population in order and randomize a probability q
   f) IF q < Mutation rate THEN
      i. Mutate the chosen chromosome at a weighted bit position
      ENDIF
   g) Keep the fittest parent at the end of the population
   h) Check termination condition
3) ENDFOR
4) Output estimated parameters
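The loop above can be sketched compactly in Python (the original implementation is in Java, so this is an illustration, not the authors' code). Several choices here are assumptions: the fitness is taken as 1 / (1 + MSE), the mean value function is a Rayleigh-type form $m(t) = a(1 - e^{-(bt)^2})$, each chromosome holds two 32-bit single-precision parameters, the mutation weights only exclude the sign bits, and the rates, population size and initial parameter ranges are arbitrary. The model is fitted to cumulative failure times, so the $i$-th failure time should satisfy $m(t_i) \approx i$.

```python
import math
import random
import struct

SIGN_FREE = [0.0] + [1.0] * 31  # simplified weights: never flip a sign bit


def encode(a, b):
    """Concatenate two float32 bit strings into one 64-bit chromosome."""
    return "".join(
        format(struct.unpack(">I", struct.pack(">f", v))[0], "032b") for v in (a, b)
    )


def decode(bits):
    """Split a 64-bit chromosome into the two parameters (a, b)."""
    return tuple(
        struct.unpack(">f", struct.pack(">I", int(bits[i:i + 32], 2)))[0]
        for i in (0, 32)
    )


def fitness(bits, times):
    """Assumed fitness 1 / (1 + MSE); non-finite or non-positive parameters score 0."""
    a, b = decode(bits)
    if not (math.isfinite(a) and math.isfinite(b) and a > 0 and b > 0):
        return 0.0
    # Assumed Rayleigh-type mean value function m(t) = a * (1 - exp(-(b t)^2)).
    errors = [a * (1.0 - math.exp(-(b * t) * (b * t))) - (i + 1)
              for i, t in enumerate(times)]
    mse = sum(e * e for e in errors) / len(times)
    return 1.0 / (1.0 + mse) if math.isfinite(mse) else 0.0


def run_mga(times, rng, pop_size=30, max_generations=200, patience=50,
            crossover_rate=0.7, mutation_rate=0.1):
    # Initialize the population from random positive parameter values.
    population = [encode(rng.uniform(1.0, 100.0), rng.uniform(0.001, 1.0))
                  for _ in range(pop_size)]
    best, best_score, stall, history = population[0], -1.0, 0, []
    for _ in range(max_generations):
        scores = [fitness(c, times) for c in population]
        top = max(scores)
        history.append(top)
        # Termination check: stop when the best fitness has stalled.
        if top > best_score:
            best, best_score, stall = population[scores.index(top)], top, 0
        else:
            stall += 1
        if stall >= patience:
            break
        # Roulette wheel selection (tiny offset keeps the total weight positive).
        offspring = rng.choices(population, weights=[s + 1e-12 for s in scores],
                                k=pop_size - 1)
        # Uniform crossover on consecutive pairs.
        for i in range(0, len(offspring) - 1, 2):
            if rng.random() < crossover_rate:
                x, y = offspring[i], offspring[i + 1]
                mask = [rng.random() < 0.5 for _ in x]
                offspring[i] = "".join(q if m else p for p, q, m in zip(x, y, mask))
                offspring[i + 1] = "".join(p if m else q for p, q, m in zip(x, y, mask))
        # Mutation: flip one bit at a weighted position.
        for i, c in enumerate(offspring):
            if rng.random() < mutation_rate:
                pos = rng.choices(range(64), weights=SIGN_FREE * 2, k=1)[0]
                offspring[i] = c[:pos] + ("1" if c[pos] == "0" else "0") + c[pos + 1:]
        # Keep the fittest chromosome at the end of the population.
        population = offspring + [best]
    return decode(best), history
```

Because the fittest chromosome is always carried over, the best fitness recorded in `history` never decreases from one generation to the next. A call such as `run_mga(times, random.Random(1))` returns the estimated `(a, b)` together with that history, where `times` is a list of cumulative failure times.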

SPRT Analysis of Data Sets: Time domain
In this section, the developed SPRT methodology is illustrated on time-domain software failure data. The decision rules based on the considered mean value function for five different data sets, borrowed from Pham (2006)