Granthaalayah

OPTIMIZATION OF THE HIGUCHI METHOD

 

James Wanliss 1Icon

Description automatically generated, R. Hernandez Arriaza 2, G. Wanliss 1, S. Gordon 3

 

1 Department of Physics, Presbyterian College, SC, USA

2 Department of Chemistry and Biochemistry, University of South Carolina, USA

3 Department of Biology, Presbyterian College, SC, USA.

 

 

 

A picture containing logo

Description automatically generated

 

 

 

 

 

 

 

 

 

 

 

 

Received 18 October 2021

Accepted 4 November 2021

Published 30 November 2021

Corresponding Author

James Wanliss, jawanliss@presby.edu

DOI 10.29121/granthaalayah.v9.i11.2021.4393

Funding: This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.

Copyright: © 2021 The Author(s). This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

 

 

 


 

 

 

ABSTRACT

 

Background and Objective: Higuchi’s method of determining fractal dimension (HFD) occupies a valuable place in the study of a wide variety of physical signals. In comparison to other methods, it provides more rapid, accurate estimations for the entire range of possible fractal dimensions. However, a major difficulty in using the method is the correct choice of tuning parameter (kmax) to compute the most accurate results. In the past researchers have used various ad hoc methods to determine the appropriate kmax choice for their data. We provide a more objective method of determining, a priori, the best value for the tuning parameter, given a particular length data set.

Methods: We create numerous simulations of fractional Brownian motion to perform Monte Carlo simulations of the distribution of the calculated HFD.

Results: Experimental results show that HFD depends not only on kmax but also on the length of the time series, which enable derivation of an expression to find the appropriate kmax for an input time series of unknown fractal dimension.

Conclusion: The Higuchi method should not be used indiscriminately without reference to the type of data whose fractal dimension is examined. Monte Carlo simulations with different fractional Brownian motions increases the confidence of evaluation results.

 

Keywords: Fractal Methods, Numerical Analysis, Signal Processing Methods, Genetics, Bioinformatics

 

1. INTRODUCTION

         The Higuchi algorithm Higuchi (1988) is one of many widely used methods to compute fractal properties of complex nonlinear physical signals Esteller et al. (2001). It is often preferred when big data are analyzed because it is stable, rapid, accurate, relatively low-cost, and excels better known linear methods. We call the fractal dimension calculated via the Higuchi algorithm the Higuchi fractal dimension (HFD).

         The Higuchi algorithm was first applied to turbulence in space plasmas Higuchi (1988) but is applicable to any data generated by complex systems since these tend to exhibit multidimensional fluctuations over many orders of magnitude. The method thus leverages the multiple scaling domains generated by complex systems and derives their characteristic fractal (or multifractal) scaling exponents.

          Over the past several decades the Higuchi algorithm has been extensively used to study pathologies in biological systems. Its utility increases in an era of big data where real-time computation continues to grow in importance Gomolka et al. (2018), Klonowski et al. (2004). The Higuchi algorithm has been successfully employed in numerous medical studies including human gait

 


Doyle et al. (2004), epilepsy Khoa and Toi (2012), disorders Wajnsztejn et al. (2016), Kawe et al. (2019), magnetoencephalograms Gómez et al. (2009), Khoa and Toi (2012), neuro-physiology Kesić and Spasić (2016), electroencephalograms Paramanathan and Uthayakumar (2008), Kawe et al. (2019), Zappasodi et al. (2014), and brain entrainment Shamsi et al. (2021). The Higuchi method generally exceeds the efficiency of well-known linear methods such as the fast Fourier and wavelet transforms. Those methods work well when signals are stationary, but the increasing richness of data that are not only nonstationary but generated by non-equilibrium and noisy processes eliminates their advantages.

The Higuchi algorithm exhibits sensitive dependence on a tuning parameter kmax, defined in the following section. In the original paper Higuchi did not elaborate on the selection of the tuning parameter and for illustration used kmax=211 for time series of length N=217. Subsequent authors used similar values for the tuning parameter but it has been discovered that the tuning parameter plays a crucial role in estimation the HFD. Several studies have addressed the issue of proper selection of tuning parameter kmax. An early paper Affinito et al. (1997) calculated HFD values for kmax=3-10 for time series ranging in length from N=50-1000, settling on kmax=6 as the optimum. The goal was to determine, in their study of electroencephalograms, the most suitable pair of (kmax, N).

Table 1 Use of Higuchi algorithm and parameters

Reference

Topic

N

kmax

N/ kmax

Higuchi (1988)

Space plasmas

217

211

64

Wajnsztejn et al. (2016)

Psychological disorder

1000

10

100

Gomolka et al. (2018)

Heart rate variability

100

5

20

Accardo et al. (1997)

EEG

50-1000

6 (3-10)

8.3-166.7

Doyle et al. (2004)

A/P gait

2400

60

40

Virkkala et al. (2002)

EEG

200

8

25

Klonowski et al. (2004)

Economics

216

15

14.4

Mujiono et al. (2013)

DNA

222

5-100

104-106

Zappasodi et al. (2014)

EEG

2560

16

160

Gómez et al. (2009)

MEG

848

48

17.7

Polychronaki et al. (2010)

Epilepsy

800

2-80

10-400

 

Decades later the literature is unclear on the method of determining the appropriate value of the tuning parameter, usually suggesting that kmax must be deter-mined on a case-by-case basis. Several papers recommend plotting the calculated HFD against a range of kmax, and selecting the appropriate kmax at the location where the calculated HFD approaches a local maximum or asymptote, considered to be a saturation point Doyle et al. (2004), Wajnsztejn et al. (2016). Gomolka et al. (2018) select kmax on the basis of statistical tests that allow discrimination between known healthy and diabetic subjects. Paramanathan and Uthayakumar (2008) proposed to determine kmax based on a size-measure relationship, that employs a recursive length of the signal from different measuring scales. In most cases researchers found that the calculated HFD is not much affected by length of the time-series but depends more strongly on kmax. Therefore, a poorly chosen tuning parameter can severely prejudice results.

Table 1 shows a sample of pairs of (kmax, N) from the literature. There is clearly no established procedure widely accepted for determining the tuning parameter. The plethora of values selected for the tuning parameter makes it clear that the community would benefit from a careful consideration of selection criteria for the tuning parameter, especially with long time-series and big data where it is impractical to examine an almost infinite number of curves of HFD against kmax.

In summary, one of main the difficulties in performing the Higuchi algorithm are that it relies on a tuning parameter, kmax, that in most cases must be selected before the fractal dimension is computed. Our goal in this paper is to explore the optimum sample pairs between the tuning parameter and length of the time series for the most general types of data.

 

2. MATERIALS AND METHODS

The Higuchi algorithm takes a signal and discretizes it into the form of a time series,  From this series we derive a new time series, , defined as follows:

 

where [] represents the integer part of the enclosed value. The integer is the start time;  is the time interval, with  is a free tuning parameter. This means that a given time interval equal to , spawns -sets of new time series. For instance, if  and the time series has length , the following three time series are derived from the original data:

 

 

The length of any one of these curves is given by:

 

 

The length of the curve for the time interval  is then defined as the average over the  sets of

 

 

If this equation scales according to the rule  then the time series behaves as a fractal with dimension D. Thus, the HFD is defined as the slope of the straight line that fits the curve of ln(L(k)) versus ln(1/k).

 

 

 

 

Chart, line chart

Description automatically generated

Figure 1 Sample fBm for HFD between 1.1 and 1.9.

 

The data we experiment on are theoretically derived fractional Brownian motions (fBm). fBm is a continuous-time random process proposed by Mandelbrot and John (1968). A signal that displays fBm is expressed in terms of stochastic integrals of time integrations of fractional Gaussian noise:

 

 

Here W is a white noise process defined on (-∞, ∞), and  is known as the Hurst parameter. The Hurst exponent for the signal is its roughness averaged over many length scales. The covariance function is given by

 

 

so that  and . This means that for the special case H=1/2, fBm reduces to the well-known random walk. The relationship between H and HFD is HFD , with values between 1 and 2. Thus, we are able to use the fBm process as our data source with a well-defined HFD in order to determine how well the Higuchi algorithm is able to accurately recover the theoretical value.

Figure 1 shows sample data of fBm signals for different values of the HFD. We create these data using a wavelet-based synthesis of fBm generation based on a biorthogonal wavelet method proposed by Meyer and Sellan Abry and Sellan (1996), Bardet et al. (2003) implemented in Matlab software. Figure 2 shows examples of how, in the Higuchi algorithm, the slope is calculated from the slope of linear fits for the curve of ln(L(k)) versus ln(1/k) for theoretical values of HFD of 1.3 and 1.7.

 

 

 

 

Chart, scatter chart

Description automatically generated

Figure 2 HFD is defined as the slope of the straight line that fits the curve of ln(L(k)) versus ln(1/k).

 

3. RESULTS AND DISCUSSIONS

One way that is suggested to find the optimal kmax is by plotting the calculated HFD against the kmax, and by selecting the value in the range for which HFD(k) achieves a plateau. This assumes that a plateau is achieved in every case for different classes of HFD. It is not a priori evident that this will be the case since, as is demonstrated in Figure 1, different values for the fractal dimension create time series with different roughness, which will influence the accuracy of the Higuchi algorithm. Those with HFD below 1.5 are smoother curves having clear long-range dependence while curves with higher values are rougher.

Figure 3 shows the error between the theoretical HFD and the value calculated via the Higuchi algorithm, for time series of length N=1,000, and varying values of kmax. In general, for the smallest kmax=2 the persistent time series, with HFD<1.5 produces an underestimate the theoretical fractal dimension, and antipersistent time series an overestimate. What we note here is that not every case shows the HFD estimate reaches a plateau, thus demonstrating that seeking to find the optimum tuning parameter by a plateau is not in general a valid procedure in the Higuchi algorithm. Indeed, for the persistent time series there is no plateau that is achieved. The antipersistent curves do show a clear plateau yet, contrary to general recommendations, selecting kmax near the plateau does not yield the lowest error, which is rather achieved for kmax values far beyond the plateau (i.e., for HFD=1.9, 1.7)

Chart, diagram

Description automatically generated

Figure 3 Error difference between Higuchi algorithm and theoretical HFD values, versus kmax, for different fBm's of length N=1,000

In order to explore how the HFD depends on the tuning parameter we created 100 independent time series with lengths, N, varying between 1,000 and 200,000 data points. Next, for each of these series, we compute the HFD for different values of kmax, allowing kmax to vary between kmax =2 and kmax =N/2. The idea is to produce Monte Carlo simulations from which to derive a distribution of outcomes that can be analyzed. This results in a set of HFD values as a function of the time series length, and the tuning parameter, kmax. Next, we create surface plots comparing the HFD versus the tuning parameter kmax and time series length, N. To best compare the results for various theoretical HFD from the wavelet method, we calculate an error metric, defined as, the percentage error:

 

 

Chart, histogram

Description automatically generated

Figure 4 Surface showing the percentage error between the Higuchi algorithm and theoretical HFD=1.9. The curve of least error is shown as a thick grey line.

This error metric allows one to easily see where the Higuchi algorithm best approximates the correct HFD, and where it produces overestimates or underestimates. Figure 4, Figure 5, Figure 6, Figure 7, Figure 8 shows the surface plots for E(kmax,N) for values of HFD=1.9, 1.7, 1.5, 1.3, 1.1. The grey line overlaid in each figure shows the curves of minimum error, corresponding to the kmax tuning parameter that yields the best correspondence between the theoretical and HFD values for a given N, averaged over 100 unique time series.

Chart

Description automatically generated

Figure 5 Surface showing the percentage error between the Higuchi algorithm and theoretical HFD=1.7. The curve of least error is shown as a thick grey line

Chart

Description automatically generated

Figure 6 Surface showing the percentage error between the Higuchi algorithm and theoretical HFD=1.5. The curve of least error is shown as a thick grey line

 

Chart

Description automatically generated

Figure 7 Surface showing the percentage error between the Higuchi algorithm and theoretical HFD=1.3. The curve of least error is shown as a thick grey line

 

Chart

Description automatically generated

Figure 8 Surface showing the percentage error between the Higuchi algorithm and theoretical HFD=1.1. The curve of least error is shown as a thick grey line

By taking a simple average of these minimum error curves of least error we are able to derive a best-fit curve using a cubic function, shown in Figure 9 as the dashed curve, for the average relationship between the time series length and the tuning parameter, for different HFD values:

 

 

 This function can be used to generate a tuning parameter estimate that can be expected to yield accurate results for the HFD, for cases where N<200,000.

Chart, line chart

Description automatically generated

Figure 9 Comparison of the average minimum error curve (solid) and the best fit cubic function (dashed).

 

4. APPLICATION TO GENETICS

We next demonstrate analysis on physical data, that of a complete genome of a bacterial organism, viz. the proteobacteria Acidovorax avenae which attacks the Cucurbitaceae family. These data are freely available from the National Center for Biotechnology Information (NCBI), which is part of the United States National Library of Medicine (NLM), a branch of the National Institutes of Health (NIH).

Chart, line chart

Description automatically generated

Figure 10 Acidovorax avenae complete genome represented by the Peng Abry and Sellan (1996) method.

In order to analyze hidden patterns in the genome, the nucleotide sequence must be converted from a symbolic sequence, meaning A, G, C, T; to a numeric representation. The Peng method Peng et al. (1992) was followed wherein DNA is represented as a “random walk” with two parameters ruling the direction of the “walk” and the resulting displacement. Starting from the first nucleotide, every time we encounter a pyrimidine base, we move up one position. On the other hand, when a purine base is encountered in the series, we move down one position. The nucleotide distance from the first nucleotide is then plotted versus the displacement, as in Figure 10, Figure 11 shows the computed HFD against tuning parameter kmax from time series of length N=200,000 that are subsets from the original series shown in Figure 10. We took 39 non-overlapping series and computed curves for each of them. Figure 11 shows that the curves (black lines) computed from different subsets of the genome are very similar, which indicates that the fractal dimension does not vary significantly along the genome.

In each case the results show a clear plateau and an HFD<1.5, indicating a persistent time series, though with widely divergent values due to the different tuning parameter.

Chart

Description automatically generated

Figure 11 HFD calculated for varying length kmax between 2 and 100,000 for nonoverlapping time series of length N=200,000 from Acidovorax Avenae. The grey triangle shows where our research indicates the appropriate kmax should be situated.

 

The plateau region is shown with the arrows and, using a plateau as the measure of where to select kmax, suggests the best tuning parameter should be in the range kmax =10-50, thus yielding HFD~1.49. However, our analysis, given time series length N=200,000, recommends a value around kmax =1,000 (grey triangle) which yields HFD~1.42. This difference allows one to easily see where the Higuchi algorithm best approximates the correct HFD, and where it produces over- or underestimates.

 

5. CONCLUSIONS AND RECOMMENDATIONS

The Higuchi algorithm Higuchi (1988) is one of many widely used methods to compute fractal properties of complex nonlinear physical signals from a wide variety of research areas Esteller et al. (2001), Wanliss  and  Reynolds (2003), Mitsutake et al. (2004), Cersosimo and Wanliss (2007). Over the past decades the Higuchi algorithm has been extensively used to study pathologies in biological systems and to discriminate between healthy signals and those displaying pathologies. It is often preferred when large amounts of data are analyzed because it is stable, rapid, accurate, relatively low-cost, and excels better known linear methods for study fractal properties of a time series. However, the method requires the user to input a free tuning parameter, kmax., the selection of which influences computational efficiency of the algorithm and, more importantly, the value of the computed HFD. Thus, all the benefits of the algorithm can be negated by poor selection of the tuning parameter. Different values of kmax can produce widely divergent estimates for the HFD from the same time series, thus it is imperative to have a method for appropriately determining the best tuning.

It has been suggested (e.g., by Doyle et al. (2004) and Wajnsztejn et al. (2016)) Doyle et al. (2004), Wajnsztejn et al. (2016) that the best kmax to derive the most precise HFD should be where the calculated HFD (kmax) approaches a local maximum or asymptote, considered to be a saturation point. However, as we have demonstrated in Figure 3, it is not a given that any physical data will result in such a plateau or saturation point. This methodology can produce spurious results and can be a time consuming, iterative process.

In this paper we have explored Monte Carlo computer realisations of wavelet derived fBm time series, with known HFD. We have demonstrated that calculation of an accurate HFD depends not only on the appropriate kmax but is also dependent on time series length, N. We have calculated a relationship that determines the appropriate kmax to obtain the best HFD, given the time series length. We find that a third order polynomial will yield an appropriate kmax given the particular time series length, N. In general, persistent time series, with HFD<1.5, tended to need smaller kmax than antipersistent series.

In a future work we will extend these results to larger data sets and explore the effects from simulations of synthetic synthetic time series with known fractal properties, to see how well the method performs, and how it is influenced by the generation algorithms.

 

ACKNOWLEDGEMENTS

This work was supported by the US National Science Foundation (Grant No. AGS-2053689) and SC-INBRE.

 

REFERENCES

Abry, P. and Sellan, F., (1996). The Wavelet-Based Synthesis for Fractional Brownian Motion Proposed by F. Sellan and Y. Meyer: Remarks and Fast Implementation. Applied and Computational Harmonic Analysis, 3(4), pp.377-383. Retrieved from https://doi.org/10.1006/acha.1996.0030

Affinito, M., Carrozzi, M., Accardo, A. and Bouquet, F., (1997). Use of the fractal dimension for the analysis of electroencephalographic time series. Biological Cybernetics, 77(5), pp.339-350. Retrieved from https://doi.org/10.1007/s004220050394

Bardet, J.-M.; G. Lang, G. Oppenheim, A. Philippe, S. Stoev, M.S. Taqqu (2003), "Generators of long-range dependence processes: a survey," Theory and appli-cations of long-range dependence, Birkhäuser, 579-623.

Cersosimo, D. O., and J. A. Wanliss (2007), Initial studies of high latitude magnetic field data during different magnetospheric conditions, Earth Planets Space, 59(1), 39- 43. Retrieved from https://doi.org/10.1186/BF03352020

Doyle, T., Dugan, E., Humphries, B. and Newton, R., (2004). Discriminating between elderly and young using a fractal dimension analysis of centre of pressure. International Journal of Medical Sciences, pp.11-20. Retrieved from https://doi.org/10.7150/ijms.1.11

Esteller, R., Vachtsevanos, G., Echauz, J. and Litt, B., (2001). A comparison of waveform fractal dimension algorithms. IEEE Transactions on Circuits and Systems I: Fundamental Theory and Applications, 48(2), pp.177-183. Retrieved from https://doi.org/10.1109/81.904882

Gomolka, R., Kampusch, S., Kaniusas, E., Thürk, F., Széles, J. and Klonowski, W., (2018). Higuchi Fractal Dimension of Heart Rate Variability During Percutaneous Auricular Vagus Nerve Stimulation in Healthy and Diabetic Subjects. Frontiers in Physiology, 9. Retrieved from https://doi.org/10.3389/fphys.2018.01162

Gómez, C., Mediavilla, Á., Hornero, R., Abásolo, D. and Fernández, A., (2009). Use of the Higuchi's fractal dimension for the analysis of MEG recordings from Alzheimer's disease patients. Medical Engineering & Physics, 31(3), pp.306-313. Retrieved from https://doi.org/10.1016/j.medengphy.2008.06.010

Higuchi, T., (1988). Approach to an irregular time series on the basis of the fractal theory. Physica D: Nonlinear Phenomena, 31(2), pp.277-283. Retrieved from https://doi.org/10.1016/0167-2789(88)90081-4

Kawe, T., Shadli, S. and McNaughton, N., (2019). Higuchi's fractal dimension, but not frontal or posterior alpha asymmetry, predicts PID-5 anxiousness more than depressivity. Scientific Reports, 9(1). Retrieved from https://doi.org/10.1038/s41598-019-56229-w

Kawe, T.N.J., Shadli, S.M. & McNaughton, N. (2019) Higuchi's fractal dimension, but not frontal or posterior alpha asymmetry, predicts PID-5 anxiousness more than depressivity. Sci Rep 9, 19666. https://doi.org/10.1038/s41598-019-56229-w Retrieved from https://doi.org/10.1038/s41598-019-56229-w

Kesić, S. and Spasić, S., (2016). Application of Higuchi's fractal dimension from basic to clinical neurophysiology: A review. Computer Methods and Programs in Biomedicine, 133, pp.55-70. Retrieved from https://doi.org/10.1016/j.cmpb.2016.05.014

Khoa, T., Ha, V. and Toi, V., (2012). Higuchi Fractal Properties of Onset Epilepsy Electroencephalogram. Computational and Mathematical Methods in Medicine, 2012, pp.1-6. Retrieved from https://doi.org/10.1155/2012/461426

Klonowski, W., Olejarczyk, E. and Stepien, R., (2004). 'Epileptic seizures' in economic organism. Physica A: Statistical Mechanics and its Applications, 342(3-4), pp.701-707. Retrieved from https://doi.org/10.1016/j.physa.2004.05.045

Mandelbrot, Benoit B., and John W. Van Ness. (1968) "Fractional Brownian Motions, Fractional Noises and Applications." SIAM Review, vol. 10, no. 4, Society for Industrial and Applied Mathematics, pp. 422-37. Retrieved from https://doi.org/10.1137/1010093

Mitsutake G, Otsuka K, Oinuma S, Ferguson I, Cornélissen G, Wanliss J, Halberg F (2004) Does exposure to an artificial ULF magnetic field affect blood pressure, heart rate variability and mood? Biomed Pharmacother 58:S20-S27. Retrieved from https://doi.org/10.1016/S0753-3322(04)80004-0

Paramanathan, P. and Uthayakumar, R., (2008). Application of fractal theory in analysis of human electroencephalographic signals. Computers in Biology and Medicine, 38(3), pp.372-378. Retrieved from https://doi.org/10.1016/j.compbiomed.2007.12.004

Peng, C., Buldyrev, S., Goldberger, A., Havlin, S., Sciortino, F., Simons, M. and Stanley, H., (1992). Long-range correlations in nucleotide sequences. Nature, 356(6365), pp.168-170. Retrieved from https://doi.org/10.1038/356168a0

Shamsi, Elham, Mohammad Ali Ahmadi-Pajouh, Tirdad Seifi Ala, (2021) Higuchi fractal dimension: An efficient approach to detection of brain entrainment to theta binaural beats, Biomedical Signal Processing and Control, Volume 68, 102580. Retrieved from https://doi.org/10.1016/j.bspc.2021.102580

Wajnsztejn, R., Carvalho, T., Garner, D., Raimundo, R., Vanderlei, L., Godoy, M., Ferreira, C., Valenti, V. and Abreu, L., (2016). Higuchi fractal dimension applied to RR intervals in children with Attention Defi cit Hyperactivity Disorder. Journal of Human Growth and Development, 26(2), p.147. Retrieved from https://doi.org/10.7322/jhgd.119256

Wanliss, J. A., and M. A. Reynolds (2003), Measurement of the stochasticity of low-latitude geomagnetic temporal variations, Ann. Geophys., 21, 2025. Retrieved from https://doi.org/10.5194/angeo-21-2025-2003

Zappasodi F, Olejarczyk E, Marzetti L, Assenza G, Pizzella V, Tecchio F (2014) Fractal Dimension of EEG Activity Senses Neuronal Impairment in Acute Stroke. PLoS ONE 9(6): e100199. Retrieved from https://doi.org/10.1371/journal.pone.0100199       

 

 

Creative Commons Licence This work is licensed under a: Creative Commons Attribution 4.0 International License

© Granthaalayah 2014-2021. All Rights Reserved.