We were talking about correlation all this time without first knowing what it actually means. So let's proceed. Given
two sequences of numbers, say (1 5 2) and (2 5 3), correlation measures how similar the two sequences are. The value
obtained is at its maximum when the two of them are equal. How do we perform correlation?
Find the sum of the products of the overlapping elements and
normalize that value by the product of the square roots of the sums of squares of the individual sequences. If that sounds too complex,
try it with the above example:
val = (1*2 + 5*5 + 2*3) = 33
normalizing value = sqrt(1*1 + 5*5 + 2*2) * sqrt(2*2 + 5*5 + 3*3)
= sqrt(30) * sqrt(38) ≈ 33.76
ans = 33/33.76 ≈ 0.977 < 1
Had the two
sequences been the same, you would get a value equal to one. Hence the more different the two sequences are, the lower
the correlation value will be. Play around with different sequences to see their influence on the value obtained.
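
If you would rather let the computer do the arithmetic, here is a minimal Python sketch of the same calculation. The function name normalized_correlation is my own choice for illustration, and the sketch assumes both sequences have the same length and are not all zeros.

```python
import math

def normalized_correlation(a, b):
    """Sum of overlapping products, divided by the product of the
    square roots of the sums of squares of each sequence."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    overlap = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        raise ValueError("a sequence of all zeros cannot be normalized")
    return overlap / norm

# The example from the text: 33 / 33.76
print(round(normalized_correlation([1, 5, 2], [2, 5, 3]), 3))  # 0.977
```

Feeding it the same sequence twice, for example (1 5 2) against (1 5 2), gives one up to floating-point rounding.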
Enough of math, let's come back to its application in vision.
It is generally said that disparity can measure depth. I don't know exactly what others mean by this, so I will give my own view clearly here and
tie triangulation, correlation, convergence and disparity together with a single knot.
Our eyes, being
at an offset from each other, each get a slightly different view of the scene. The offset between the two views (the disparity) depends on the distance of the object from us: the smaller
the distance, the greater the offset. But it is this disparity that allows the eyes to converge on a particular object
through correlation. The consequence of all this is triangulation. Depth is perceived if the overlap results in
a successful correlation; otherwise there is suppression. Hence, for the object of interest being looked at, disparity does not
come into play, since its two images are overlapped. Disparity is observed for objects at a different depth from the one being looked at, and
when you look at one of those instead, its disparity vanishes in turn, since the images are now overlapped at that region.
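
To make the link between correlation and disparity concrete, here is a small Python sketch of my own. It is a toy, not a model of what the eyes actually do: two short lists stand in for one row of the left and right views, and the helper best_disparity (a hypothetical name) slides a patch from the left row across the right row and keeps the shift with the highest normalized correlation. The patch position, patch width and maximum shift are all assumed values chosen for the example.

```python
import math

def normalized_correlation(a, b):
    """Sum of overlapping products over the product of the two norms."""
    overlap = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return overlap / norm if norm else 0.0

def best_disparity(left, right, start, width, max_shift):
    """Return (shift, score) for the shift that best aligns
    left[start:start+width] with the right row.
    Assumes the feature sits further left in the right view."""
    patch = left[start:start + width]
    best_shift, best_score = 0, -1.0
    for shift in range(max_shift + 1):
        lo = start - shift
        if lo < 0:
            break  # the patch would fall off the edge of the row
        score = normalized_correlation(patch, right[lo:lo + width])
        if score > best_score:
            best_shift, best_score = shift, score
    return best_shift, best_score

# The feature (2 9 4) sits at index 4 in the left row and index 2 in the right row.
left_row  = [1, 1, 1, 1, 2, 9, 4, 1, 1, 1]
right_row = [1, 1, 2, 9, 4, 1, 1, 1, 1, 1]

shift, score = best_disparity(left_row, right_row, start=4, width=3, max_shift=4)
print(shift)  # 2
```

The shift that wins is the one where the two patches overlap exactly, which is the situation I described above: once the views are aligned on that feature, the correlation is at its maximum and no disparity is left for it.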