HYPERSPECTRAL IMAGE CLASSIFICATION BASED ON INTELLIGENT OPTIMIZATION FEATURE SELECTION

Hyperspectral image classification is a long-standing research topic. The high dimensionality of pixel spectra, combined with a shortage of labeled training samples, causes the "curse of dimensionality". To reduce the data dimension, this paper proposes a feature selection method based on an intelligent optimization algorithm. The method uses mutual information and symmetric uncertainty to construct a fitness function, searches the candidate feature set with the intelligent optimization algorithm, and obtains an optimal feature set. An SVM classifier is then trained on the optimal feature set. On a real hyperspectral data set, the new method was compared with several feature selection methods, and the experimental results show that the optimal feature set achieves high classification accuracy.


Introduction
Hyperspectral remote sensing can acquire a large amount of image information about ground objects at a distance [1][2][3][4], imaging the same surface in many different bands. The values of a pixel across the bands form a spectral curve, according to which ground objects can be classified and identified. On the one hand, the rich spectral information helps to classify and identify ground targets; on the other hand, the high correlation between bands produces many redundant bands, which, together with the shortage of labeled training samples, leads to the "curse of dimensionality" and reduces the accuracy of classification and identification. Reducing the dimension and finding a feature subset suited to classification and recognition is therefore the key to solving the problem.
In the field of hyperspectral image classification, scholars have proposed a variety of feature (band) selection methods. Rashedul [5] proposed a feature selection method based on segmented principal component analysis to classify hyperspectral images. Yu [6] proposed a dimension reduction method based on significant feature extraction. Yang [7] proposed the use of separable nonnegative matrix factorization to achieve hyperspectral band selection. Ding [8] proposed a restricted polymorphic ant colony algorithm for band selection. Xie [9] proposed an unsupervised hyperspectral feature selection method based on fuzzy c-means and the grey wolf optimizer. Peng [10] proposed a hyperspectral image classification method based on unsupervised feature selection.
The above methods focus on eliminating redundant features one by one and forming feature subsets by elimination, without considering the problem from the perspective of global optimization. This paper presents a hyperspectral feature selection method based on particle swarm optimization. The method uses mutual information and symmetric uncertainty to construct a fitness function, and then uses the particle swarm optimization algorithm to search the candidate feature set, producing an optimized feature set and a redundant feature set. The optimized feature set is used for the subsequent classification and recognition task. On a real hyperspectral data set, the new method was compared with several feature selection methods, and the experimental results show that the optimal feature set achieves high classification accuracy.

Materials and Methods
The hyperspectral image is a three-dimensional cube, as shown in Fig.1, where I, J and K correspond to its length, width and spectral dimensions respectively. The kth band image is denoted B_k, which is an I × J pixel matrix. For any pixel, the values of the corresponding points in each band image can be extracted to form an n-dimensional spectral feature. Initially, all the bands form the candidate feature set; feature selection obtains from it the optimal feature subset, which is used to form the sample points and complete the subsequent classification and recognition tasks.
The classification process of hyperspectral images is shown in Fig.2. Firstly, the hyperspectral image is preprocessed to remove the noise bands, and the resulting feature set is called the candidate feature set B. Then, Particle Swarm Optimization (PSO) [11] is used to select features from B, yielding the optimal feature set Q and the redundant feature set R. Finally, the optimal feature set Q is used for the subsequent classification and recognition task.
The process of band optimization is to remove redundant bands from the candidate feature set and obtain the optimized feature set. Selecting the best feature combination in the candidate feature set can improve the accuracy of hyperspectral classification. The key to feature optimization with the particle swarm optimization algorithm is the particle coding and the choice of fitness function.
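As a concrete illustration of the data layout described at the start of this section, the following minimal sketch builds a toy cube and extracts a band image and a pixel spectrum from it. The sizes, the values and the helper names (`band_image`, `pixel_spectrum`, `to_samples`) are assumptions made for illustration and do not come from the paper.

```python
# Toy cube with I rows, J columns and K bands; cube[i][j][k] is the value
# of pixel (i, j) in band k. Sizes and values are illustrative only.
I, J, K = 2, 3, 4
cube = [[[100 * i + 10 * j + k for k in range(K)] for j in range(J)]
        for i in range(I)]


def band_image(cube, k):
    """Return band k as an I x J matrix (the B_k of the text)."""
    return [[pixel[k] for pixel in row] for row in cube]


def pixel_spectrum(cube, i, j):
    """Return the spectral feature vector of pixel (i, j)."""
    return list(cube[i][j])


def to_samples(cube):
    """Flatten the cube into I * J samples, one spectral vector per pixel."""
    return [list(pixel) for row in cube for pixel in row]
```

Each row of `to_samples(cube)` is one sample point; feature selection later keeps only some of its components.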
Suppose the candidate feature set is B and the optimal feature set is Q; the redundant feature set is then R = B - Q. A particle is an n-dimensional binary vector, each dimension of which represents a band. The ith component of the particle vector indicates the selection result for the ith feature: 1 means the feature is selected, and 0 means it is not. The selected bands are therefore those whose component value is 1, as shown in Fig.3, and each particle position corresponds to one feature selection result. For example, in the AVIRIS image, 200 band images remain after pre-processing removes the noise bands, so a 200-dimensional particle represents the candidate feature set.
Feature selection seeks the feature subset with the strongest classification ability in the candidate feature set. Because the features are highly correlated, removing redundant features can improve recognition accuracy and reduce computational complexity. Since the particle swarm optimization algorithm performs the selection, a fitness function is needed to measure the classification ability of a feature subset. This paper adopts a correlation-based feature evaluation method, which selects features with low mutual correlation to form the feature subset, thus eliminating redundant features. With Q the optimal feature subset and R the redundant feature subset, the fitness function J of the particle swarm optimization algorithm is defined as a ratio: the numerator represents the symmetric uncertainty between the optimal features and the redundant features, and the denominator represents the symmetric uncertainty among the optimal features. U is the symmetric uncertainty function; a larger U value indicates that the two feature sets are more strongly correlated, and a smaller value that they are less correlated.
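The particle coding and the symmetric uncertainty measure can be sketched as follows. The sketch assumes the commonly used definition U(X, Y) = 2 I(X; Y) / (H(X) + H(Y)), where I is the mutual information and H the Shannon entropy, and it assumes that band values have already been quantized to discrete levels. The function names are illustrative, and the exact form of J used in this paper is not reproduced here.

```python
import math
from collections import Counter


def split_bands(particle):
    """Decode a 0/1 particle into selected (Q) and redundant (R) band indices."""
    Q = [i for i, bit in enumerate(particle) if bit == 1]
    R = [i for i, bit in enumerate(particle) if bit == 0]
    return Q, R


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def symmetric_uncertainty(x, y):
    """Assumed standard form: 2 * I(X; Y) / (H(X) + H(Y)), in [0, 1]."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0.0:
        return 0.0  # both features are constant, so nothing is shared
    mutual_information = hx + hy - entropy(list(zip(x, y)))
    return 2.0 * mutual_information / (hx + hy)
```

With this definition, two identical features give U = 1 and two independent features give U = 0, which matches the statement that a larger U indicates stronger correlation.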
After the fitness function J is determined, the particle swarm optimization (PSO) algorithm is used to solve the optimization problem. PSO is simple, easy to implement, and has few parameters to tune. It has been widely used in function optimization, neural network training, fuzzy system control and other fields, with good results. In this paper, the optimal feature subset is obtained with the PSO algorithm, and the method is referred to as PSO-FS. The pseudo-code is as follows:
Algorithm PSO-FS
r = 1; // r is the iteration counter
for each particle i
    Initialize Vi; // initialize the particle velocity vector
    Initialize Xi; // initialize the particle position vector
    pbesti = Xi; // initialize the particle's best position
end for
J(gbest) = min{J(pbesti)}; // initialize the global optimal position
Here, Vi is the velocity of particle i; pbesti is the historical optimal position of particle i; gbest is the global optimal position; Xi is the position of particle i; w is the inertia parameter, usually w = 0.8; c1 and c2 are the learning factors, usually c1 = c2 = 2; rand() generates a random number between 0 and 1.
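A minimal runnable sketch of a binary PSO search over band masks is given below, under stated assumptions. The velocity update is the standard PSO rule with the parameters named above (w = 0.8, c1 = c2 = 2). The sigmoid mapping from velocity to a 0/1 position, the velocity clamp, the swarm size and the fixed iteration budget are assumptions, since the paper does not specify them here. `toy_fitness` and `TARGET` are hypothetical stand-ins for J, used only to make the example self-contained.

```python
import math
import random


def pso_fs(fitness, n, n_particles=20, n_iter=50, w=0.8, c1=2.0, c2=2.0, seed=0):
    """Minimise fitness(mask) over 0/1 masks of length n (binary PSO sketch)."""
    rng = random.Random(seed)
    # Initialise positions, velocities and each particle's best position.
    X = [[rng.randint(0, 1) for _ in range(n)] for _ in range(n_particles)]
    V = [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n_particles)]
    pbest = [x[:] for x in X]
    pbest_J = [fitness(x) for x in X]
    best = min(range(n_particles), key=lambda i: pbest_J[i])
    gbest, gbest_J = pbest[best][:], pbest_J[best]

    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(n):
                # Standard velocity update: inertia + cognitive + social terms.
                v = (w * V[i][d]
                     + c1 * rng.random() * (pbest[i][d] - X[i][d])
                     + c2 * rng.random() * (gbest[d] - X[i][d]))
                V[i][d] = max(-6.0, min(6.0, v))  # assumed velocity clamp
                # Assumption: a sigmoid turns the velocity into P(bit = 1).
                prob_one = 1.0 / (1.0 + math.exp(-V[i][d]))
                X[i][d] = 1 if rng.random() < prob_one else 0
            J = fitness(X[i])
            if J < pbest_J[i]:
                pbest[i], pbest_J[i] = X[i][:], J
            if J < gbest_J:
                gbest, gbest_J = X[i][:], J
    return gbest, gbest_J


# Hypothetical fitness: number of bits that differ from a known target mask.
TARGET = [1, 0, 1, 0, 0, 1, 0, 1]


def toy_fitness(mask):
    return sum(a != b for a, b in zip(mask, TARGET))


best_mask, best_J = pso_fs(toy_fitness, n=len(TARGET))
```

In an actual run, `toy_fitness` would be replaced by the fitness J evaluated on the bands selected by the mask, and `n` would be the number of candidate bands.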
The algorithm terminates when either of two conditions is met: 1) the number of iterations exceeds 500; 2) the minimization criterion is met. The minimization criterion requires that the change in the global optimal fitness between two successive iterations be less than a threshold ε, that is, |J(gbest_{r+1}) - J(gbest_r)| < ε, where gbest_r is the global optimal position generated in the rth iteration. In this paper, ε = 0.001.
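The two termination conditions can be expressed as a small check. This is a sketch only: the function name `should_stop` and the convention that the history list holds one global-best fitness value per completed iteration are assumptions.

```python
def should_stop(gbest_history, max_iter=500, eps=0.001):
    """Return True when either termination condition holds.

    gbest_history[r - 1] is J(gbest) after iteration r (assumed convention).
    """
    iterations = len(gbest_history)
    if iterations > max_iter:
        return True  # condition 1: the iteration count exceeds the limit
    if iterations >= 2 and abs(gbest_history[-1] - gbest_history[-2]) < eps:
        return True  # condition 2: the fitness change is below the threshold
    return False
```

Applied literally, condition 2 also fires when the global best simply fails to improve in one iteration, so an implementation may want to require several stagnant iterations; that variation is not described in the paper.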

Results and Discussions
To verify the hyperspectral image classification method based on particle swarm optimization feature selection (PSO-FS), experiments were carried out on a real hyperspectral data set. The data set consists of hyperspectral data of different crops collected by the AVIRIS imaging spectrometer over a field in northwestern Indiana. It contains 220 band images, each 145 × 145 pixels in size; Fig.4 shows the image of the 25th band. Of the 220 bands, 20 noise bands (104-108, 150-163, 220) were discarded in pre-processing, and the remaining 200 bands were optimized by the particle swarm optimization algorithm. The image contains 16 kinds of land cover. Two experiments were conducted to verify the effectiveness of the new method. The first compared the optimal features with the candidate features and the redundant features. The second compared the optimal features with those produced by other feature selection methods.
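The band bookkeeping above can be checked with a few lines. The sketch assumes the listed band numbers are 1-based and the ranges inclusive; the variable names are illustrative.

```python
# Noise bands as listed in the text (1-based, inclusive ranges assumed).
noise_bands = set(range(104, 109)) | set(range(150, 164)) | {220}

# Candidate bands are all 220 bands minus the noise bands.
all_bands = range(1, 221)
candidate_bands = [b for b in all_bands if b not in noise_bands]

# Each band image is 145 x 145 pixels, so every band holds this many pixels.
pixels_per_band = 145 * 145
```

The counts agree with the text: 5 + 14 + 1 = 20 noise bands, leaving 200 candidate bands for the 200-dimensional particle.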
Experiment 1: Feature selection divides the candidate feature set B into two subsets: the optimal feature set Q and the redundant feature set R. To verify the effectiveness of the feature optimization results, training and test samples were built for each of the three feature sets, and SVM was used to classify the pixels on each set. The experimental results are shown in Table 1, from which it can be seen that:
1) The classification accuracy with the candidate feature set B is not high, because the redundant bands increase the dimension of the classification space and the lack of labeled training samples leads to the "curse of dimensionality", which reduces the identification accuracy.
2) The classification accuracy of the optimal feature set Q is greatly improved compared with that of the candidate feature set B and the redundant feature set R.
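The evaluation protocol of Experiment 1, in which the same classifier is trained and tested on different band subsets, can be sketched as follows. To keep the example dependency-free, a nearest-centroid classifier stands in for the SVM used in the paper. The data values, the helper names and the use of overall accuracy as the metric are assumptions made for illustration, and the numbers produced are not results from the paper.

```python
def select_bands(samples, bands):
    """Keep only the given band indices of every sample."""
    return [[s[b] for b in bands] for s in samples]


def fit_centroids(X, y):
    """Nearest-centroid stand-in for the SVM: one mean vector per class."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for d, value in enumerate(x):
            acc[d] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def predict(centroids, X):
    """Assign each sample to the class with the closest centroid."""
    def nearest(x):
        return min(centroids,
                   key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])))
    return [nearest(x) for x in X]


def overall_accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy samples with three bands; the values are invented for illustration.
train_X = [[0.0, 5.0, 1.0], [0.2, 1.0, 1.1], [1.0, 5.0, 0.9], [0.8, 1.0, 1.0]]
train_y = ["a", "a", "b", "b"]
test_X = [[0.1, 1.0, 1.0], [0.9, 5.0, 1.0]]
test_y = ["a", "b"]

Q = [0]  # hypothetical selected band subset
model = fit_centroids(select_bands(train_X, Q), train_y)
accuracy_Q = overall_accuracy(test_y, predict(model, select_bands(test_X, Q)))
```

Repeating the last three lines with the index lists for B and R gives the three-way comparison described in Experiment 1.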
Experiment 2: To test the effectiveness of the proposed algorithm, two related band selection algorithms were selected for comparison: the Conv-Deconv algorithm [12] and the SPCA-nMI algorithm [5]. Table 2 compares the classification accuracy of the Conv-Deconv, SPCA-nMI and PSO-FS algorithms. Among them, PSO-FS performs better than the other two algorithms.

Conclusions and Recommendations
To improve the accuracy of hyperspectral image classification and recognition, the first problem to be solved is how to represent the features of pixels effectively. The high dimensionality of hyperspectral data makes it necessary to select an appropriate feature subset from the original feature set in order to reduce feature redundancy. In this paper, hyperspectral feature selection is formulated as a combinatorial optimization problem: the fitness function is constructed from symmetric uncertainty and mutual information, and the particle swarm optimization algorithm is then used to solve the optimization problem. The experimental results show that the optimized feature set improves the accuracy of hyperspectral image classification.