Validate Cross Correlation, Part 3
30 Aug 2015
In the previous post we showed how cross-correlation could be used to find the time delay between identical and very simple functions. Now we want to explore what happens when one of the signals has some noise.
In the last post we were considering two simple triangular signals A and B, with B delayed some 13 microseconds from A.
We now modify B by adding some 5% noise to it:
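The original listing for this step is not shown here, so the following is only a minimal sketch of how it could look in Python with NumPy. The signal length `N`, the `PERIOD` of the triangular wave, the random seed, and the helper name `triangle_wave` are hypothetical. "5% noise" is read as uniform noise within ±5% of the peak value of A, which is an assumption.

```python
import numpy as np

def triangle_wave(n, period):
    """Triangular wave with values in [0, 1], one sample per microsecond (assumed)."""
    phase = (np.arange(n) % period) / period
    return 1.0 - np.abs(2.0 * phase - 1.0)

# N and PERIOD are hypothetical; only the 13 us delay comes from the post.
N, PERIOD, DELAY = 65536, 64, 13

rng = np.random.default_rng(2015)   # fixed seed so the sketch is repeatable
a = triangle_wave(N, PERIOD)
b = np.roll(a, DELAY)               # B is A delayed by 13 samples (circular shift)

# Assumed meaning of "5% noise": uniform noise within +/-5% of A's peak value.
noise_level = 0.05 * a.max()
b_noisy = b + rng.uniform(-noise_level, noise_level, size=N)
```

The circular shift via `np.roll` is a convenience: because `N` is a whole number of periods, the wrapped samples line up with the wave and no discontinuity is introduced.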
And compute the cross-correlation with A:
We can also check the value of this cross-correlation:
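As a rough illustration of these two steps, the sketch below computes a circular cross-correlation with NumPy's FFT and reads the delay off the position of the peak. It is not the original code: the helper names, `N`, `PERIOD`, the seed, and the ±5% uniform noise model are all assumptions. NumPy's inverse FFT already applies the 1/N factor, so the values are plain sums of products.

```python
import numpy as np

def triangle_wave(n, period):
    """Triangular wave with values in [0, 1], one sample per microsecond (assumed)."""
    phase = (np.arange(n) % period) / period
    return 1.0 - np.abs(2.0 * phase - 1.0)

def cross_correlation(a, b):
    """Circular cross-correlation of two real signals, computed with the FFT.

    Entry k holds sum_n a[n] * b[n + k], with indices taken modulo len(a).
    """
    return np.fft.ifft(np.conj(np.fft.fft(a)) * np.fft.fft(b)).real

# N and PERIOD are hypothetical; only the 13 us delay comes from the post.
N, PERIOD, DELAY = 65536, 64, 13
rng = np.random.default_rng(2015)

a = triangle_wave(N, PERIOD)
b_noisy = np.roll(a, DELAY) + rng.uniform(-0.05, 0.05, size=N)

corr = cross_correlation(a, b_noisy)

# The wave repeats every PERIOD samples, so its correlation peak repeats too.
# Searching a single period keeps the estimate unambiguous.
estimated_delay = int(np.argmax(corr[:PERIOD]))
print("estimated delay (us):", estimated_delay)
print("peak value:", corr[estimated_delay])
```

Restricting the search to one period is a design choice forced by the periodic test signal; with a non-repeating signal the search would cover all lags.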
No changes! The estimated delay is the same as without noise, so the cross-correlation can cope with a small amount of noise without problems. To finish the examples with triangular functions, we add a lot of noise to the signal:
And once more we compute the cross-correlation:
And obtain basic statistics about the cross-correlation values:
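A hedged sketch of the heavy-noise case follows. "A lot of noise" is taken to mean uniform noise within ±50% of the peak of A, which is an assumption, as are `N`, `PERIOD`, the seed, and the helper names. The summary printed at the end (minimum, median, mean, maximum) is one plausible choice of basic statistics.

```python
import numpy as np

def triangle_wave(n, period):
    """Triangular wave with values in [0, 1], one sample per microsecond (assumed)."""
    phase = (np.arange(n) % period) / period
    return 1.0 - np.abs(2.0 * phase - 1.0)

def cross_correlation(a, b):
    """Circular cross-correlation: entry k is sum_n a[n] * b[n + k] (mod len(a))."""
    return np.fft.ifft(np.conj(np.fft.fft(a)) * np.fft.fft(b)).real

# N and PERIOD are hypothetical; only the 13 us delay comes from the post.
N, PERIOD, DELAY = 65536, 64, 13
rng = np.random.default_rng(2015)

a = triangle_wave(N, PERIOD)
# Assumed meaning of "a lot of noise": uniform within +/-50% of A's peak value.
b_very_noisy = np.roll(a, DELAY) + rng.uniform(-0.5, 0.5, size=N)

corr = cross_correlation(a, b_very_noisy)
estimated_delay = int(np.argmax(corr[:PERIOD]))   # search one period only

summary = {
    "min": corr.min(),
    "median": np.median(corr),
    "mean": corr.mean(),
    "max": corr.max(),
}
print("estimated delay (us):", estimated_delay)
for name, value in summary.items():
    print(f"{name:>6}: {value:.2f}")
```

The estimate survives here largely because the signal is long: the noise is zero-mean and uncorrelated with A, so its contribution averages out over many samples. A much shorter signal would be less forgiving.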
Once more, there are no changes to the estimate! The cross-correlation can deal with uniform noise without problems.
Quotes and Square functions
So far we have been using triangular functions because they were easy to generate. Market signals more closely resemble square functions: a quote value is valid until it changes. Moreover, market data is not regularly sampled in time. One might receive no updates for several milliseconds, and then receive multiple updates in the same microsecond! But to illustrate how this would work we can make our life easy. Suppose we have the best bid quantity sampled every microsecond, and it had the following values:
We use a similar trick as before to create a time shifted version of this signal, and add some noise to it:
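The actual quote values are not reproduced here. The sketch below builds a made-up step signal of the same flavor and applies the shift-and-noise trick to it. Every number in it is hypothetical: the quantity range, the hold times, the length `N`, the seed, and the reuse of the 13 microsecond shift. The noise is assumed uniform within ±5% of the mean quantity.

```python
import numpy as np

rng = np.random.default_rng(2015)
N, DELAY = 4096, 13   # hypothetical length; the shift reuses the earlier 13 us

# Hypothetical best-bid quantities: each level is held for a random number
# of microseconds, mimicking "a quote value is valid until it changes".
n_steps = 400
levels = rng.integers(900, 1101, size=n_steps)   # quantities in 900..1100
holds = rng.integers(11, 64, size=n_steps)       # hold times of 11..63 us
# 400 steps of at least 11 us each always cover more than N samples.
bid_qty = np.repeat(levels, holds)[:N].astype(float)

# Time-shifted copy (circular shift), plus uniform noise.
shifted = np.roll(bid_qty, DELAY)
noise_level = 0.05 * bid_qty.mean()              # assumed: +/-5% of the mean
shifted_noisy = shifted + rng.uniform(-noise_level, noise_level, size=N)
```

Sampling every microsecond turns the irregular quote updates into a regular series, which is what the FFT-based cross-correlation requires.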
And as before we can compute the cross-correlation:
And obtain basic statistics about the cross-correlation values:
One problem is that the difference between the peak and the minimum is not that high: in relative terms it is only 0.7%.
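The small relative difference has a simple cause: the quantities are all positive with a large mean, so every lag accumulates roughly the number of samples times the squared mean, and the peak adds only a thin layer on top. The sketch below measures that gap on a made-up step signal. All values in it are hypothetical (quantity range, hold times, `N`, seed, noise model), so the percentage it prints is not expected to match the 0.7% above.

```python
import numpy as np

def cross_correlation(a, b):
    """Circular cross-correlation: entry k is sum_n a[n] * b[n + k] (mod len(a))."""
    return np.fft.ifft(np.conj(np.fft.fft(a)) * np.fft.fft(b)).real

rng = np.random.default_rng(2015)
N, DELAY = 4096, 13   # hypothetical length and shift

# Hypothetical best-bid quantities held for random durations.
levels = rng.integers(900, 1101, size=400)
holds = rng.integers(11, 64, size=400)
bid_qty = np.repeat(levels, holds)[:N].astype(float)

# Shifted copy with uniform noise within +/-5% of the mean (assumed).
noise_level = 0.05 * bid_qty.mean()
shifted_noisy = np.roll(bid_qty, DELAY) + rng.uniform(-noise_level, noise_level, size=N)

corr = cross_correlation(bid_qty, shifted_noisy)
estimated_delay = int(np.argmax(corr))

# Relative gap between the highest and lowest correlation values.
peak, low = corr.max(), corr.min()
relative_gap = (peak - low) / peak

print("estimated delay (us):", estimated_delay)
print(f"peak: {peak:.4e}  min: {low:.4e}  relative gap: {relative_gap:.2%}")
```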
Conclusion
In these last three posts we have reviewed how cross-correlations work for simple triangular functions, triangular functions with some noise, and finally for step functions with noise. We observed that some FFT libraries save computation by not rescaling their output, which can make the results harder to interpret. We also observed that the result of the cross-correlation is a measure of area, so it can take very large values for some functions; rescaling it would also be desirable.