Correlation Filter | |
---|---|
Description | Allows $O(n^s)$ data comparison for correlation data. |
Threaded | Yes |
Memory Usage | Extreme - $O(n^s)$ without pruning |
Stream Support | Yes |
Added In | 0.3.0 |

The correlation filter allows users to compare inputs in a stream against other inputs using an equation. That is, it provides a way to check every element of one input stream against every element of another input stream. It can also compare a stream against itself, or compare any number of streams, provided that the computational overhead of such a large number of comparisons is feasible on your hardware.

# Basic Features

The correlation filter has four main panels: Streams, Values, Parameters, and Pruning. The Streams panel allows you to define which streams will be used, the Values panel allows you to set which values will appear in the output (including equational support), the Parameters panel allows you to include additional parameters in the output, and the Pruning panel allows you to provide a heuristic that reduces the number of data comparisons done.

## Streams

The Streams panel is fairly straightforward. It lists all of the streams in the incoming data source and lets you define how many times each will be used in a comparison. Note that the streams are compared in the order in which they appear on this list - that is, iterations are nested according to the order in which the streams appear here. For example, set Stream 0 to 1 use and Stream 1 to 2 uses. In this case, variables starting with vi will come from Stream 0, variables starting with vj will come from the outer iteration of Stream 1, and variables starting with vk will come from the inner iteration of Stream 1. The iteration will be:

```
for (i in stream 0)
    for (j in stream 1)
        for (k in stream 1)
            execute equation
```

For a further example, if you were to use Stream 0 twice, Stream 1 twice, and Stream 2 once, the variables would start with vi, vj, vk, vl, and vm, and would iterate as follows:

```
for (i in stream 0)
    for (j in stream 0)
        for (k in stream 1)
            for (l in stream 1)
                for (m in stream 2)
                    execute equation
```
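The mapping from usage counts to loop levels can be sketched in Python. This is an illustration of the ordering described above, not SwiftVis code; the function names `loop_levels`, `loop_letters`, and `nested_iteration` are made up for this example.

```python
from itertools import product

LETTERS = "ijklmnopqrstuvwxyz"

def loop_levels(uses):
    """Expand per-stream usage counts into one stream index per loop level.

    uses[s] is how many times stream s is used, in list order,
    e.g. [2, 2, 1] -> [0, 0, 1, 1, 2].
    """
    return [s for s, count in enumerate(uses) for _ in range(count)]

def loop_letters(uses):
    """Map each loop letter (i, j, k, ...) to the stream it iterates over."""
    return {LETTERS[depth]: s for depth, s in enumerate(loop_levels(uses))}

def nested_iteration(streams, uses):
    """Yield one tuple of rows per execution of the equation.

    product() varies its last argument fastest, which matches
    the innermost for-loop in the pseudocode.
    """
    return product(*(streams[s] for s in loop_levels(uses)))

# Stream 0 twice, Stream 1 twice, Stream 2 once.
print(loop_letters([2, 2, 1]))
```

With usage counts `[2, 2, 1]`, the letters i and j map to Stream 0, k and l to Stream 1, and m to Stream 2, matching the loops above.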

### Note

There is a known bug in the Streams panel: to store a new stream value after entering it, be sure to press [enter] before clicking out of the field.

## Custom Equation Parsing

The Correlation Filter implements its own equation parsing system. Normally, any equation entered and parsed in SwiftVis defaults to Stream 0; there is little support for addressing a specific stream, and the mechanisms that do exist do not fit iterative comparison well. For this reason, the Correlation Filter uses a custom syntax to indicate which stream each piece of data is taken from when defining output.

To properly explain this, assume we have a single stream, Stream 0, and we have indicated that it will be used twice. The code structure of this iteration will look something like

```
for (i in stream 0)
    for (j in stream 0)
        execute equation
```

In order to translate this into SwiftVis, the equation parsers in the Correlation Filter address individual streams using the letters i, j, k, l, m, n, …, z. For example, if we wanted output that summed each possible pair of numbers in Stream 0, we first indicate that Stream 0 will be used twice, and then add a Value on the Values panel with the Expression:

`vi0 + vj0`

This says: at the ith loop level, take v[0] from that level's stream, and add to it v[0] from the stream at the jth level. Because the loops are nested, the sum is computed for every pairing of an element from the ith level with an element from the jth level. To help reinforce this idea, let's look at another example:

We have two input streams, Stream 0 and Stream 1. We would like to sum every pair of values from Stream 0's v[0] and Stream 1's v[0], and divide each sum by the v[1] value on the same row of Stream 1. Since we only want to use each stream once, we would set both streams' usage to 1. Then, we would add a value to the Values panel with Expression:

`(vi0+vj0)/vj1`
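As a rough illustration of how such an expression reads, the following Python sketch decodes the variable names and evaluates the expression by hand over two small streams. The regular expression, the `decode` helper, and the sample data are assumptions made for this example; this is not the parser SwiftVis uses.

```python
import re

# A variable is "v", one loop letter (i..z), then a value index.
TOKEN = re.compile(r"\bv([i-z])(\d+)\b")

def decode(expression):
    """Return a (loop letter, value index) pair for each variable found."""
    return [(m.group(1), int(m.group(2))) for m in TOKEN.finditer(expression)]

print(decode("(vi0+vj0)/vj1"))  # [('i', 0), ('j', 0), ('j', 1)]

# Hypothetical data: each row is a tuple of values v[0], v[1], ...
stream0 = [(1.0,), (2.0,)]
stream1 = [(3.0, 2.0), (5.0, 4.0)]

# Each stream is used once, so i walks Stream 0 and j walks Stream 1.
results = []
for i in stream0:
    for j in stream1:
        results.append((i[0] + j[0]) / j[1])

print(results)  # [2.0, 1.5, 2.5, 1.75]
```

Two rows in each stream give four output rows, one per (i, j) pairing.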

For further examples see the Correlation Filter Tutorial.

## Values

The Values panel in the Correlation Filter is not unlike the Values panel in the Function Filter - it merely provides a list of equational Expressions that define the output of the filter. Mirror Values will mirror all values in all streams currently in use. Note that mirroring the output will result in $O(n^s)$ memory usage, where $n$ is the number of elements and $s$ is the number of stream uses (a stream used twice counts twice), because it will append every combination of inputs to the output.
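To get a feel for how quickly the output grows, the row count can be estimated before running the filter. The sketch below assumes one output row per execution of the equation and no pruning, so a stream used twice contributes its size twice; `output_rows` is a name made up for this example.

```python
def output_rows(stream_sizes, uses):
    """Estimate the number of output rows for unpruned nested iteration.

    stream_sizes[s] is the number of elements in stream s,
    uses[s] is how many times stream s is used.
    """
    total = 1
    for size, count in zip(stream_sizes, uses):
        total *= size ** count
    return total

print(output_rows([1000], [2]))           # 1000000
print(output_rows([1000, 1000], [2, 1]))  # 1000000000
```

A single 1,000-element stream compared against itself already yields a million rows, and adding one more use of a stream of the same size multiplies that by another thousand.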

This is also the panel where the actual correlation mathematics should be entered. For example, if we had a set of (x,y) Cartesian coordinates in Stream 0 at v[0] and v[1] and we wanted to correlate the set against itself by computing the distance between every pair of points, we would indicate on the Streams panel that the filter should use Stream 0 twice, and then add a Values entry with Expression:

`sqrt((vi0-vj0)**2+(vi1-vj1)**2)`
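The same computation, written out in Python over a small hypothetical point set, shows what the output contains when one stream is iterated against itself. The sample points are invented for illustration.

```python
import math

# Hypothetical Stream 0: each row is (x, y), i.e. v[0] and v[1].
points = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]

# Stream 0 is used twice, so both loop levels walk the same rows.
distances = [
    math.sqrt((pi[0] - pj[0]) ** 2 + (pi[1] - pj[1]) ** 2)
    for pi in points   # i level
    for pj in points   # j level
]

print(distances)
# [0.0, 5.0, 10.0, 5.0, 0.0, 5.0, 10.0, 5.0, 0.0]
```

If the iteration follows the nested loops literally, each point is also paired with itself (distance 0) and every distinct pair appears twice, once in each order, so three points give nine rows.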

## Parameters

The Parameters panel in the Correlation Filter is not unlike the Parameters panel in the Function Filter - it merely provides a list of equational Expressions that define the parameters in the output of the filter. Mirror Parameters will mirror all parameters in all streams currently in use. Note that mirroring the output will result in $O(n^s)$ memory usage, where $n$ is the number of elements and $s$ is the number of stream uses (a stream used twice counts twice), because it will append every combination of inputs to the output. Note also that, as with all parameters, the result of any equational Expression will be truncated to an Integer value. As with the Values panel, equational Expressions are defined using the custom syntax.

## Pruning

While it is not advisable to use large data sets with the Correlation Filter due to its nested iteration, careful use of the Pruning panel may allow you to do controlled, small-scale comparisons across large data sets. The Pruning panel takes a few heuristics for sorting the input data and builds a spatial tree out of them. Once the tree is built, the value and parameter equations are evaluated only for pairs that meet the requirements of all of the heuristics provided.

For a given axis, you must define two things: the equational Expression to compute for the axis, and the range limit for comparisons. The equational Expression is a classical-style Expression, meaning it does not use `vi0` syntax. Instead, it reverts to the classic `v[#]` syntax when evaluating input. This is because streams are not considered when building the tree - only specific values. Once the spatial tree is built using axes defined by these Expressions, the range becomes important. When the Values expressions are executed, it is normally the case that every point is checked against every other point. However, when a range is specified, every point is checked only against the other points within the provided range for the provided equational Expression. If the Expression is v[0] and the range is 4, each point will only be checked against every other point for which |v[0] - v'[0]| < 4, where v[0] is the initial point and v'[0] is the point that may be checked against it.
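The effect of a single pruning axis can be sketched in Python. Here a sorted list with binary search stands in for the spatial tree, which is an assumption made to keep the example short and is not how SwiftVis stores the data; `pruned_pairs` and the sample rows are hypothetical.

```python
from bisect import bisect_left, bisect_right

def pruned_pairs(rows, key, limit):
    """Yield (a, b) pairs of rows with |key(a) - key(b)| < limit.

    key plays the role of the axis Expression (e.g. v[0]) and
    limit plays the role of the range.
    """
    ordered = sorted(rows, key=key)
    keys = [key(r) for r in ordered]
    for a, ka in zip(ordered, keys):
        lo = bisect_right(keys, ka - limit)  # first key > ka - limit
        hi = bisect_left(keys, ka + limit)   # first key >= ka + limit
        for b in ordered[lo:hi]:
            yield a, b

# Hypothetical rows with a single value v[0]; Expression v[0], range 4.
rows = [(0,), (1,), (4,), (5,), (10,)]
pairs = list(pruned_pairs(rows, key=lambda v: v[0], limit=4))

print(len(pairs), "of", len(rows) ** 2, "comparisons kept")
```

With these five rows, 11 of the 25 possible pairings survive. The rows with v[0] of 0 and 4 are not paired, because the comparison is strict and their difference is exactly 4.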

# Tutorials Including Correlation Filter

# Other Applications

Aside from correlation, this filter may be used to combine multiple inputs mathematically. For example, you can sum the v[0] values of several streams, divide a value in one stream by a value in another, or perform any number of other data comparisons.