Validate Cross Correlation, Part 2
29 Aug 2015
If you are unfamiliar with the markets, or with how to interpret a market feed as a real-valued function, you might want to check the previous post on this topic.
Let’s start with a simple function and apply the Fourier transform and its inverse. The R snippets are available in the repository if you prefer to see or modify the code. Here we will break after each bit of code to offer some explanations.
First we write a simple function to generate triangular functions. Nothing fancy really, but it will save us time later
using the function we create a triangle
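The original snippet is not reproduced here, so the following is only a minimal sketch of what such a generator might look like. The name `triangle` and its parameters `n`, `width`, and `peak` are assumptions made for illustration, not the code from the repository.

```r
# Hypothetical helper: a triangular pulse sampled at n points.
# The pulse rises linearly from 0 to `peak` over the first width/2 samples,
# falls back to 0 over the next width/2 samples, and is 0 afterwards.
triangle <- function(n, width, peak = 1) {
  i <- seq_len(n) - 1          # zero-based sample index 0 .. n-1
  half <- width / 2
  pmax(0, peak * (1 - abs(i - half) / half))
}

# Create a triangle with 128 samples whose base spans the first 32 samples.
v <- triangle(n = 128, width = 32)
```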
and wrap the triangle in a data.frame(), because ggplot2 really likes data.frame()
ggplot2 generates sensible and good looking plots in most cases, make sure it is loaded
then we can plot the triangle function
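As a rough illustration of these three steps (wrap, load, plot), here is a sketch that assumes the ggplot2 package is installed. The column names `time` and `value` and the inline triangle are assumptions for the example.

```r
library(ggplot2)

# Assumed stand-in for the triangle: 128 samples, peak of 1 at index 16.
value <- pmax(0, 1 - abs(0:127 - 16) / 16)

# ggplot2 works on data.frame objects, so pair each sample with its index.
df <- data.frame(time = seq_along(value) - 1, value = value)

# A simple line plot of the triangular function.
p <- ggplot(df, aes(x = time, y = value)) +
  geom_line() +
  labs(x = "sample", y = "value")
print(p)
```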
next, let’s apply the FFT and its inverse to the triangular function
and save this into a new data.frame()
let’s add the new data to the data.frame()
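A sketch of the round trip and of combining both series in one data.frame might look like the following. It uses base R's `fft()`; the `series` column and its labels are assumptions chosen so that the two curves can be told apart in a plot.

```r
# Assumed stand-in for the triangle: 128 samples, peak of 1 at index 16.
value <- pmax(0, 1 - abs(0:127 - 16) / 16)

# Forward transform followed by the inverse transform as base R computes it.
# Re() drops the imaginary parts, which are zero up to rounding error here.
roundtrip <- Re(fft(fft(value), inverse = TRUE))

# Stack the original and the round-tripped data in long format.
time <- seq_along(value) - 1
df <- rbind(
  data.frame(time = time, value = value, series = "original"),
  data.frame(time = time, value = roundtrip, series = "fft round trip")
)
```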
What is going on here? To save computations, the “Fast” Fourier Transform omits rescaling the function by $1/N$, where $N$ is the number of samples. If we apply this rescaling manually, things match perfectly
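The rescaling can be checked numerically. In base R, `fft(z, inverse = TRUE)` computes the unnormalized inverse, so the raw round trip comes back multiplied by the number of samples. The variable names below are assumptions for this sketch.

```r
# Assumed stand-in for the triangle: 128 samples, peak of 1 at index 16.
value <- pmax(0, 1 - abs(0:127 - 16) / 16)
n <- length(value)

# Raw round trip: scaled up by the number of samples.
raw <- Re(fft(fft(value), inverse = TRUE))

# Dividing by the number of samples recovers the original function.
rescaled <- raw / n

# Both differences are at the level of floating-point rounding error.
max(abs(raw - n * value))
max(abs(rescaled - value))
```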
The more or less obvious question is how well a function correlates with itself. This is easy to compute
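One way to compute this with the FFT is sketched below. The helper name `correlation` and the convention used (conjugating the transform of the first argument, so that entry `k + 1` holds the sum of `a[i] * b[i + k]` with indices wrapping around) are assumptions; the original code may use a different convention.

```r
# Hypothetical helper: circular cross-correlation computed via the FFT.
# Assumed convention: result[k + 1] = sum over i of a[i] * b[i + k],
# with indices taken modulo the length of the vectors.
correlation <- function(a, b) {
  stopifnot(length(a) == length(b))
  Re(fft(Conj(fft(a)) * fft(b), inverse = TRUE)) / length(a)
}

# Assumed stand-in for the triangle: 128 samples, peak of 1 at index 16.
value <- pmax(0, 1 - abs(0:127 - 16) / 16)

# Correlating the function with itself peaks at lag 0 (the first entry),
# where the value equals the sum of the squared samples.
autocorr <- correlation(value, value)
```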
we are going to be wrapping these functions in a data.frame() a lot, so let’s create a function for it
and plot the results
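A wrapping helper of this kind could be as small as the sketch below. The name `as_series_df` and the column names are assumptions for illustration, and ggplot2 is assumed to be installed.

```r
library(ggplot2)

# Hypothetical helper: wrap a numeric vector in a data.frame with a
# zero-based sample index and a label, ready to hand to ggplot2.
as_series_df <- function(values, name = "series") {
  data.frame(time = seq_along(values) - 1, value = values, series = name)
}

# Assumed stand-in for the triangle: 128 samples, peak of 1 at index 16.
value <- pmax(0, 1 - abs(0:127 - 16) / 16)

# Circular autocorrelation via the FFT (unnormalized inverse, hence the division).
autocorr <- Re(fft(Conj(fft(value)) * fft(value), inverse = TRUE)) / length(value)

df <- as_series_df(autocorr, "autocorrelation")
p <- ggplot(df, aes(x = time, y = value)) + geom_line()
print(p)
```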
Notice that the y-axis is labeled in squared units. This is because the result of a cross-correlation is a measure of area. That means that as we process data with larger values, the cross-correlation grows with the square of those values.
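The quadratic growth is easy to verify with a small experiment: multiplying the signal by 10 should multiply the correlation peak by 100. The helper name `autocorr_peak` is an assumption made for this sketch.

```r
# Assumed stand-in for the triangle: 128 samples, peak of 1 at index 16.
value <- pmax(0, 1 - abs(0:127 - 16) / 16)

# Hypothetical helper: largest value of the circular autocorrelation of x.
autocorr_peak <- function(x) {
  max(Re(fft(Conj(fft(x)) * fft(x), inverse = TRUE)) / length(x))
}

# Scaling the input by 10 scales the peak by 10^2, up to rounding error.
ratio <- autocorr_peak(10 * value) / autocorr_peak(value)
```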
Let’s see how the correlation works with a time shifted signal
Let’s see what the cross-correlation looks like, but since we will be doing several correlations, we write a helper function …
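As a sketch of what such a helper might do, the code below correlates two signals and returns the result as a data.frame. The name `correlation_df`, the delay of 20 samples, and the circular shift are all assumptions made for the example.

```r
# Hypothetical helper: circular cross-correlation of a and b via the FFT,
# returned as a data.frame with one row per lag.
correlation_df <- function(a, b, name) {
  stopifnot(length(a) == length(b))
  cc <- Re(fft(Conj(fft(a)) * fft(b), inverse = TRUE)) / length(a)
  data.frame(lag = seq_along(cc) - 1, value = cc, series = name)
}

# Assumed stand-in for the triangle: 128 samples, peak of 1 at index 16.
value <- pmax(0, 1 - abs(0:127 - 16) / 16)

# Delay the signal by 20 samples, wrapping the tail around to the front.
delay <- 20
shifted <- c(tail(value, delay), head(value, -delay))

df <- rbind(
  correlation_df(value, value, "self"),
  correlation_df(value, shifted, "shifted")
)
```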
The graphs are pretty, but exactly where is the peak?
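Locating the peak numerically is a matter of finding the index of the largest correlation value, for instance with `which.max()`. The sketch below assumes a delay of 20 samples and the convention where the transform of the first signal is conjugated, so the peak index (minus one, because R vectors are one-based) is the delay of the second signal relative to the first.

```r
# Assumed stand-in for the triangle: 128 samples, peak of 1 at index 16.
value <- pmax(0, 1 - abs(0:127 - 16) / 16)

# The same signal delayed by 20 samples (circular shift).
delay <- 20
shifted <- c(tail(value, delay), head(value, -delay))

# Circular cross-correlation via the FFT.
cc <- Re(fft(Conj(fft(value)) * fft(shifted), inverse = TRUE)) / length(value)

# which.max() returns a one-based index, so subtract 1 to get the lag.
peak_lag <- which.max(cc) - 1
```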
That is a perfect match, but market (or other) signals rarely match so perfectly. What happens if we add some noise?
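One way to experiment with this is to add Gaussian noise to the shifted signal and look for the peak again. The noise level (`sd = 0.05`), the seed, and the delay below are arbitrary assumptions for this sketch. With noise the estimated lag is expected to land at or near the true delay, but it is no longer guaranteed to match exactly.

```r
set.seed(42)  # arbitrary seed so the experiment is repeatable

# Assumed stand-in for the triangle: 128 samples, peak of 1 at index 16.
value <- pmax(0, 1 - abs(0:127 - 16) / 16)

# Delay the signal by 20 samples, then add zero-mean Gaussian noise.
delay <- 20
shifted <- c(tail(value, delay), head(value, -delay))
noisy <- shifted + rnorm(length(shifted), mean = 0, sd = 0.05)

# Circular cross-correlation of the clean signal against the noisy one.
cc <- Re(fft(Conj(fft(value)) * fft(noisy), inverse = TRUE)) / length(value)

# Estimated delay: zero-based lag of the largest correlation value.
estimated_delay <- which.max(cc) - 1
```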