I recently found an algorithm, somewhere on the internet, that claims to find the slant of a wall or a surface using
line-based rather than pixel-based stereo correspondence. The basic idea is this. Consider a stereo image*
as shown in Fig 1.

Fig 1

The rectangle in the left image is narrower than the one in the right image. This can happen when the rectangle is slanted so that its right vertical side is closer to you than its left. Let us assume that the rectangle is slanted only with respect to the vertical plane and remains perpendicular to the horizontal plane. But when you look at this image stereoscopically, can you perceive the slant? At least I cannot. Try it out and let me know if you can.

Let me come back to the algorithm. It takes horizontal lines running from one end of the rectangle to the other, in both images. The difference in the lengths of these lines is supposed to provide the information needed to find the amount of tilt. Fine, I will agree to that, but can this algorithm do something that even my brain fails to achieve? Let me present a scenario in which it would fail to give the required results.
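
To make the line-length idea concrete, here is a minimal sketch of how the endpoints of one horizontal line could be turned into a slant angle. It is not the algorithm I found, only one plausible reading of it. It assumes an ideal, rectified pinhole stereo pair; the focal length `f`, the `baseline`, the camera placement and the function names are all assumptions made for illustration. Note that "left" and "right" in the code mean the left and right camera views, which appear swapped on the page in a cross-view pair.

```python
import math

def project(X, Z, f, cam_x):
    """Ideal pinhole projection of the point (X, Z) for a camera at x = cam_x."""
    return f * (X - cam_x) / Z

def slant_from_line(xl1, xl2, xr1, xr2, f, baseline):
    """Slant angle (radians) of the surface spanned by one horizontal line.

    xl1, xl2: image x of the line's two endpoints in the left camera view
    xr1, xr2: the same endpoints in the right camera view
    Assumes the left camera sits at x = 0 and the right at x = baseline.
    """
    d1 = xl1 - xr1          # disparity at the first endpoint
    d2 = xl2 - xr2          # disparity at the second endpoint
    if d1 <= 0 or d2 <= 0:
        raise ValueError("endpoints must have positive disparity")
    z1 = f * baseline / d1  # depth by triangulation
    z2 = f * baseline / d2
    x1 = xl1 * z1 / f       # back-project to world X
    x2 = xl2 * z2 / f
    # 0 means fronto-parallel; negative means the second endpoint is nearer
    return math.atan2(z2 - z1, x2 - x1)

# Synthetic check: a line from (-0.5, 2.0) to (0.5, 1.5), right end nearer.
f, b = 800.0, 0.065
ends = [(-0.5, 2.0), (0.5, 1.5)]
xl = [project(X, Z, f, 0.0) for X, Z in ends]
xr = [project(X, Z, f, b) for X, Z in ends]
length_difference = (xl[1] - xl[0]) - (xr[1] - xr[0])   # equals d2 - d1
slant = slant_from_line(xl[0], xl[1], xr[0], xr[1], f, b)
```

In this model the difference in line length between the two views is simply the difference in disparity between the two endpoints, which is why it carries slant information at all.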

Fig 2

Fig 2 shows a stereo image in which the rectangles are at the front and the square is at the back when viewed with crossed eyes. Switch to parallel viewing and the depths are interchanged: the square comes to the front and the rectangles go to the back. You may feel that this is pretty obvious, but look at it two or three more times and answer a few questions. First, do you find anything joining the edges of the blocks? In other words, what is the space between the blocks like? Is it empty or is it connected? If it is connected, what is connecting the blocks? Most likely you will see the blocks floating in the air.

Let me turn to a real scenario of this kind. Assume that you have a very big rectangular wall painted white, with a black square painted at its center. Two rectangular blocks are placed in front of the wall, at whatever depth you wish. That is going to end up in exactly the same formation as shown in Fig 2. The relative depths may be different, but the right view will have a greater gap between the square and the left rectangle, and the left view will have a greater gap between the square and the right rectangle.

The problem starts now! I will connect the left edge of the square to the right edge of the left rectangle, and the right edge of the square to the left edge of the right rectangle, with a wall painted exactly the same color as the background wall. Take the left and right images and compare them with the previous image, i.e. Fig 2. What is the difference? Even though you did a lot of work, it has produced exactly the same image. By now you will have understood what I am trying to explain. Even though the gap between the square and the rectangle varies in a way similar to that described by the algorithm I started out with, the gap may or may not be a slanted surface. So slant cannot be perceived with such techniques. Before discussing the solution, let me give you one more example.
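
The gap argument can be checked with a little geometry. The sketch below assumes an ideal pinhole stereo setup, and the focal length, baseline and scene dimensions are all made-up values. It projects only the edges that bound each gap. Whatever fills the gap never enters the calculation, so a line-based method sees the same numbers whether the gap is empty air or a wall painted in the background color.

```python
def project(X, Z, f, cam_x):
    """Ideal pinhole projection of the point (X, Z) for a camera at x = cam_x."""
    return f * (X - cam_x) / Z

def gap_width(edge_a, edge_b, f, cam_x):
    """Image width of the gap between two edges, each given as (X, Z)."""
    return project(*edge_b, f, cam_x) - project(*edge_a, f, cam_x)

# Assumed scene: blocks at depth 1.0, painted square on a wall at depth 2.0.
f, b = 800.0, 0.065
z_front, z_back = 1.0, 2.0
left_block_edge = (-0.3, z_front)   # right edge of the left rectangle
square_left = (-0.2, z_back)        # left edge of the square
square_right = (0.2, z_back)        # right edge of the square
right_block_edge = (0.3, z_front)   # left edge of the right rectangle

# Gaps as seen from the left camera (x = 0) and the right camera (x = b).
left_gap_L = gap_width(left_block_edge, square_left, f, 0.0)
left_gap_R = gap_width(left_block_edge, square_left, f, b)
right_gap_L = gap_width(square_right, right_block_edge, f, 0.0)
right_gap_R = gap_width(square_right, right_block_edge, f, b)

# The change in gap width depends only on the depths of the two edges.
expected_change = f * b * (1.0 / z_front - 1.0 / z_back)
```

With these values the right view shows the wider gap on the left side and the left view shows the wider gap on the right side, as described above. A slanted connecting wall between the same edges would give exactly the same widths.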

Fig 3

Fig 3 is nothing but Fig 2 with a gray connection between the rectangles and the square. Looking at only one of the images, our brain can somehow create a pseudo-3D model of the scene. It assumes that the left rectangle is frontmost, followed by the square, with the right rectangle at the back. But try looking at these images stereoscopically. What do you find? When cross-viewed, both the left and the right rectangles should be at the front, yet linear perspective makes the right rectangle take the back seat. If you keep staring at the image stereoscopically, it starts behaving like an illusion, with each rectangle taking the front or the back position at random. So is the algorithm really working?

Fig 4

In Fig 4, the right rectangle has to be in front, but because of its smaller size it appears to go to the back.

Fig 5

In Fig 5, again the illusion persists.

Fig 6

In Fig 6, the height of the rectangles is greater than the height of the square. The gray wall connecting the rectangles and the square shows linear perspective in the correct direction, so even if you don’t view the images stereoscopically, they show pseudo depth.

Now the big question: are stereo correspondence and depth perception just mathematics? Let me put down some points and explain what might be happening. If our brain perceived 3D based on stereo correspondence alone, the above illusions should not occur. Or maybe, after stereo correspondence is applied, the results are sent out for verification. The verification round uses existing knowledge of the world to make sense of what has been calculated. If the existing knowledge of the world clashes with what has been calculated, you end up confused. But if existing knowledge of the world is to be used, where does that knowledge come from initially? This is what I call training. A living system comes with only the software and not the database, so it has to build the database as it grows and learns, and so it takes years to master. For computer-based systems the database can be included with the software, so why train them? But knowledge doesn’t mean an image. It should have its own format, like the other files on your computer. So any robotic system is going to use this knowledge to verify its calculations and also update its database with some specific things. But if this verification is not used to alter the calculations in any way, why use it? What I feel is that verification is what bridges the gap between calculation and perception.

Fig 7

In Fig 7, let me not reveal the changes made with respect to Fig 6. Try observing the image stereoscopically and find them out yourself. Were you successful? In Fig 7, the central square has actually become a rectangle: its width is greater in the right view than in the left. In spite of this difference, I at least was not able to see the central wall as sloping. Let me make it a bit easier for the brain to perceive the slant by introducing some linear perspective.

Fig 8

In Fig 8, the width of the central wall differs between the two images, and some tapering has been introduced as well. It’s amazing to see that your brain perceives the slanted central wall in this case! So does stereo correspondence alone decide depth? Well, maybe not!
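
One possible reason the tapering helps is that, under perspective projection, a wall that is really slanted produces both cues together. The sketch below assumes ideal pinhole cameras, and the wall size, depths, focal length and baseline are made-up values. It projects the two vertical edges of a slanted wall into both views and reports the wall's width in each view along with the projected height of each edge.

```python
def project_x(X, Z, f, cam_x):
    """Ideal pinhole projection of the point (X, Z) for a camera at x = cam_x."""
    return f * (X - cam_x) / Z

def wall_cues(left_edge, right_edge, height, f, baseline):
    """Width in each view and projected height of each vertical edge.

    left_edge, right_edge: (X, Z) of the wall's two vertical edges
    height: physical height of the wall
    """
    widths = {}
    for name, cam_x in (("left_view", 0.0), ("right_view", baseline)):
        u_left = project_x(left_edge[0], left_edge[1], f, cam_x)
        u_right = project_x(right_edge[0], right_edge[1], f, cam_x)
        widths[name] = u_right - u_left
    # For cameras offset only horizontally, projected height depends on depth alone.
    heights = {
        "left_edge": f * height / left_edge[1],
        "right_edge": f * height / right_edge[1],
    }
    return widths, heights

# Assumed wall: right edge nearer (depth 1.5) than left edge (depth 2.0).
widths, heights = wall_cues((-0.5, 2.0), (0.5, 1.5), 1.0, 800.0, 0.065)
```

In this model the width differs between the two views and, at the same time, the nearer edge is taller than the farther one. Fig 7 changes only the width, which may be one reason the slant is hard to see there, though that is only a guess.
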
Some more tests:

Fig 9

In Fig 9, I will only say that the wall is slanted with respect to the vertical plane; which side it is tilted towards is an unknown that you have to find out using cross stereo. Let me know your results.

Fig 10

Fig 9 is actually a part of the image shown here in Fig 10. So to get the answer for yourself, try looking at this image stereoscopically. It is pretty clear from a single image alone, but still go ahead and find it out in stereo and enjoy the depth. Try your luck once again with Fig 9 if you want.

Fig 11

In Fig 11, I have placed three stereo images one below the other. Try to find the difference between these stereo pairs. You will have to stare at the images stereoscopically for some time. Even though it was difficult to perceive the slant when only a single stereo pair was presented, it becomes apparent in Fig 11. This is because two of the stereo pairs here are made such that the scene appears slanted, while in the third it is parallel to the vertical plane. So slant is always perceived with respect to some reference. You can find the amount of slant by comparing it with the parallel one, but I am not going to tell you which one is which.

(a)

(b)

Fig 12

Let’s now come back to the rectangle example with which we started our discussion. In Fig 12, I have presented filled and outlined rectangles separately. The difference between this and Fig 1 is that now we can perceive the slant of one rectangle with respect to the other. Looking at the above image, some people are unsure whether they are seeing depth or slant, i.e. are the rectangles one behind the other, or are they attached at just one end? To bring out this difference, I have the next image.

Fig 13 (a)

Fig 13 (b)

In Fig 13(a), the rectangles are the same size but are placed at an offset in one of the images, so here you see one rectangle behind the other. In Fig 13(b), the lower rectangle in one of the images is narrower than in the other, so here you see it slanted. Next, I have one last example to show the role a reference plays in perceiving slant.
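
The distinction between Fig 13(a) and Fig 13(b) can be written down as a small rule on the edge positions. This is only an illustrative sketch: the function name, the tolerance and the pixel values are hypothetical. Positions are measured relative to a reference rectangle, in line with the point that slant needs a reference.

```python
def classify(ref_left, ref_right, test_left, test_right, tol=0.5):
    """Compare a test rectangle with a reference rectangle across two views.

    Each argument is (x_start, x_end): the rectangle's horizontal extent
    in that view, in pixels. ref_* is the reference, test_* the other one.
    """
    # Disparity of each vertical edge, relative to the reference rectangle.
    start_shift = (test_left[0] - test_right[0]) - (ref_left[0] - ref_right[0])
    end_shift = (test_left[1] - test_right[1]) - (ref_left[1] - ref_right[1])
    if abs(start_shift - end_shift) > tol:
        return "slanted"            # width differs between views
    if abs(start_shift) > tol:
        return "offset in depth"    # same width, shifted as a whole
    return "same plane"

ref = (100, 200)
same = classify(ref, ref, (100, 200), (100, 200))
offset = classify(ref, ref, (100, 200), (110, 210))   # shifted, as in Fig 13(a)
slanted = classify(ref, ref, (100, 200), (100, 190))  # narrower, as in Fig 13(b)
```

A whole-rectangle shift between the views reads as depth, while a change in width reads as slant.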

Fig 14 (a)

In Fig 14(a), the dotted strip is actually slanted, but the slant cannot be perceived because there is no reference. In Fig 14(b), a black strip that is the same size in both images has been added to give the dotted strip a reference, so its slant is perceived. Make sure you cover Fig 14(b) while viewing Fig 14(a). Initially I thought the markings would help me perceive the slant very easily, which is why the dots were put on the strip. But to my surprise, it takes a lot of concentration to perceive the slant even with the dots. So even though your brain might be measuring the slant in Fig 14(a), it does not perceive it unless you concentrate a lot! Adding a reference makes the job very easy for it.

Fig 14 (b)

* Stereo image implicitly means cross stereo unless otherwise specified.