Revealing Stereo And 3D

Perception of Slant (Continued)
        I recently came across an algorithm, somewhere on the internet, that claims to find the slant of a wall or a surface using line-based rather than pixel-based stereo correspondence. The basic idea is this. Consider the stereo image* shown in Fig 1.

slant.jpg
Fig 1

        The width of the rectangle at the left is smaller than the one on the right. This can happen when the rectangle is placed at a slant, so that its right vertical side is closer to you than its left. Let us assume that the rectangle is tilted only with respect to the vertical plane and remains perpendicular to the horizontal plane. But when you look at this image stereoscopically, is it possible to perceive this slant? At least I am not able to. Try it out and let me know if you can.

        Let me come back to the algorithm. It takes horizontal lines running from one end of the rectangle to the other, in both images. The difference in the lengths of these lines provides the information needed to find the amount of tilt. Fine, I will agree to that, but can this algorithm do something that even my brain fails to achieve? Let me present a scenario in which it would fail to give the required results.
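        Before that, here is a rough sketch of the line-length idea itself. It is not the algorithm I found, only my own illustration under some assumptions: two parallel pinhole cameras with a known focal length f and baseline b (both values hypothetical), and a horizontal line whose two endpoints have already been matched in the left-camera and right-camera images. In this model the difference in line length equals the difference between the two endpoint disparities, and the endpoint disparities give the depth of each end, from which the slant follows.

```python
import math

def project(X, Z, f, b):
    """Image x-coordinates of scene point (X, Z) in the left and right cameras."""
    return f * (X + b / 2) / Z, f * (X - b / 2) / Z

def slant_from_line(left_line, right_line, f, b):
    """Estimate the slant (degrees) of a horizontal line matched in both images.

    left_line, right_line: (x_start, x_end) of the line in each camera image.
    A positive result means the right end of the line is farther away.
    """
    points = []
    for x_left, x_right in zip(left_line, right_line):
        disparity = x_left - x_right          # larger disparity = nearer point
        Z = f * b / disparity                 # depth of this endpoint
        X = Z * (x_left + x_right) / (2 * f)  # lateral position of this endpoint
        points.append((X, Z))
    (X1, Z1), (X2, Z2) = points
    return math.degrees(math.atan2(Z2 - Z1, X2 - X1))

# Hypothetical setup: f = 1, b = 0.06, a unit-length line slanted by 30 degrees.
f, b = 1.0, 0.06
start = (-0.2, 2.0)
end = (start[0] + math.cos(math.radians(30)), start[1] + math.sin(math.radians(30)))
left_start, right_start = project(*start, f, b)
left_end, right_end = project(*end, f, b)
left_line, right_line = (left_start, left_end), (right_start, right_end)

# The line is a different length in the two images.
width_difference = (left_end - left_start) - (right_end - right_start)
print(width_difference)
print(slant_from_line(left_line, right_line, f, b))
```

        Note that in this model the length difference by itself is not enough. The distance to the line is needed as well, which is why the sketch works from the endpoint positions rather than from the two lengths alone.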

push_back.jpg
Fig 2

        Fig 2 shows a stereo image in which the rectangles are at the front and the square is at the back when viewed with crossed eyes. Change to parallel viewing and the depths are interchanged: the square comes to the front and the rectangles go to the back. You may feel that this is pretty obvious, but look at it two or three more times and answer a few questions. Firstly, do you find anything joining the edges of the blocks? I mean, what is the space between the blocks like? Is it empty or is it connected? If it is connected, what is connecting them? Mostly you will find the blocks floating in the air.

        Let me come to a real scenario of this kind. Assume that you have a very big rectangular wall painted white, at the center of which a black square is painted. Two rectangular blocks are placed in front of the wall, at whatever depth you wish. That ends up in exactly the formation shown in Fig 2. The relative depths may be different, but the right view will have a greater gap between the square and the left rectangle, and the left view will have a greater gap between the square and the right rectangle.

        The problem starts now! I will connect the left edge of the square to the right edge of the left rectangle, and the right edge of the square to the left edge of the right rectangle, say with a wall, and paint it exactly the same color as the background wall. Take the left and right images and compare them with the previous image, i.e. Fig 2. What is the difference? Even though you did a lot of work, it has produced the same image. By now you will have got what I am trying to explain. Even though the gap between the square and the rectangle varies in a way similar to that described by the algorithm I started out with, the gap may or may not be a slanted surface. So slant cannot be perceived with such techniques. Before discussing the solution, let me give you one more example.
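        To make the ambiguity concrete first, here is a small sketch that renders a single horizontal scanline of the two scenes, seen from above, for a left and a right camera. Everything in it is an assumption made for illustration: the positions, depths, colors, camera offsets and the simple back-to-front painting are hypothetical values, not measurements from Fig 2. The point is only that adding connecting walls painted the same color as the background changes neither image.

```python
def render(segments, cam_x, n_pixels=400, f=1.0, half_width=1.0):
    """Paint a 1-D scanline. segments are (X1, Z1, X2, Z2, color), listed back to front."""
    line = [None] * n_pixels
    for X1, Z1, X2, Z2, color in segments:
        # Pinhole projection of both endpoints for a camera at x = cam_x.
        u1 = f * (X1 - cam_x) / Z1
        u2 = f * (X2 - cam_x) / Z2
        lo, hi = min(u1, u2), max(u1, u2)
        for i in range(n_pixels):
            u = -half_width + (i + 0.5) * 2 * half_width / n_pixels  # pixel center
            if lo <= u < hi:
                line[i] = color  # later (nearer) segments overwrite earlier ones
    return line

# Hypothetical scene, top view: X is sideways, Z is distance from the cameras.
wall = (-20, 10, 20, 10, "white")         # big white background wall
square = (-1, 10, 1, 10, "black")         # black square painted on the wall
left_rect = (-4, 5, -2, 5, "gray")        # blocks standing in front of the wall
right_rect = (2, 5, 4, 5, "gray")
left_connector = (-2, 5, -1, 10, "white")   # slanted walls joining blocks to the square,
right_connector = (1, 10, 2, 5, "white")    # painted the same color as the background

floating = [wall, square, left_rect, right_rect]
connected = [wall, square, left_connector, right_connector, left_rect, right_rect]

for cam_x in (-0.03, 0.03):  # left and right camera positions
    same = render(floating, cam_x) == render(connected, cam_x)
    print("camera at", cam_x, "-> identical scanlines:", same)
```

        Both cameras see the same scanline for the floating and the connected scene, so nothing measured from the edges alone can tell whether the gap is empty space or a slanted wall.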

push_back_wall.jpg
Fig 3

        Fig 3 is nothing but Fig 2 with a gray connection between the rectangles and the square. Now, looking at only one of the images, our brain can somehow create a pseudo 3D model of the environment. It assumes that the left rectangle is frontmost, followed by the square, with the right rectangle at the back. But try looking at these images stereoscopically. What do you find? When cross viewed, even though both the left and the right rectangles ought to be at the front, linear perspective makes the right rectangle take the back seat. If you keep staring at the image stereoscopically, it starts behaving like an illusion, with each rectangle randomly taking either the front or the back position. So is the algorithm really working?

push_back_line.jpg
Fig 4

        In Fig 4, the right rectangle has to be in front, but because of its smaller size it appears to go back.

push_back_line_wall.jpg
Fig 5

In Fig 5, again the illusion persists.

push_back_line_wall1.jpg
Fig 6

        In Fig 6, the height of the rectangles is greater than the height of the square. The gray wall connecting the rectangles and the square shows linear perspective in the correct direction, so even if you do not view the images stereoscopically, they show pseudo depth.

Now the big question is: are stereo correspondence and depth perception just mathematics? Let me put down some points and explain what might be happening.

If our brain perceived 3D purely on the basis of stereo correspondence, the above illusions should not occur. Or maybe, after stereo correspondence is applied, the results are sent out for verification. The verification round uses existing knowledge of the world to make sense of what has been calculated. If that knowledge clashes with the calculation, you end up in confusion. But if existing knowledge of the world is to be used, where does it come from initially? This is what I call training. A living system comes only with the software and not the database, so it has to build the database as it grows and learns, and that is why it takes years to master. For computer-based systems the database can be shipped with the software, so why train them? But knowledge does not mean an image. It should have its own format, like the other files on your computer. So any robotic system is going to use this knowledge to verify its calculations, and also to update its database with some specific things. But if this verification is not used to alter the calculations in any way, why use it? What I feel is that verification is what bridges the gap between calculation and perception.

push_back_line_wall_slnt.jpg
Fig 7

        In Fig 7, let me not reveal the changes made with respect to Fig 6. Try observing the image stereoscopically and find out for yourself. Were you successful? In Fig 7, the central square has actually become a rectangle. Its width is greater in the right view than in the left. In spite of this difference, at least I was not able to see the central wall as sloping. Let me make it a bit easier for the brain to perceive by introducing some linear perspective.

push_back_line_wall_slnt1.jpg
Fig 8

In Fig 8, the width of the central wall differs between the two images, and some tapering has been introduced as well. It is amazing to see that your brain perceives the slanted central wall in this case! So does stereo correspondence alone decide depth? Well, maybe not!

 

Some more tests:

room_crss_stereo2a.jpg
Fig 9

        In Fig 9, I will only say that the wall is slanted with respect to the vertical plane. Which side it is tilted towards is an unknown that you have to find out using cross stereo. Let me know your results.

room_crss_stereo1.jpg
Fig 10

        Fig 9 is actually a part of the image shown here in Fig 10. So to get the answer for yourself, try looking at this image stereoscopically. It is pretty much clear from the single image itself, but still go ahead and view it in stereo and enjoy the depth. Try your luck once again with Fig 9 if you want.

mixed_slant.jpg
Fig 11

        In Fig 11, I have placed three stereo pairs one below the other. Try to find the difference between them. You will have to stare at the images stereoscopically for some time. Even though it was difficult to perceive the slant when only a single stereo pair was presented, it becomes apparent in Fig 11. This is because two of the stereo pairs were taken so that the scene appears slanted, while in the other it is parallel to the vertical plane. So slant is always perceived with respect to some reference. You can find the amount of slant by comparing with the parallel one, but I am not going to tell you which is which.

fins.jpg
(a)

rect1.jpg
(b)

Fig 12

        Let us now come back to the rectangle example with which we started our discussion. In Fig 12, I have presented filled and outlined rectangles separately. The difference between this and Fig 1 is that now we can perceive the slant of one rectangle with respect to the other. Looking at the above image, some people wonder whether they are seeing depth or slant, i.e. are the rectangles one behind the other, or are they attached at just one end? To bring out this difference, I have the next image.

depth1.jpg
Fig 13 (a)

crssed1.jpg
Fig 13 (b)

        In Fig 13(a), the rectangles are of the same size but are placed at an offset in one of the images, so here you see one of the rectangles behind the other. In Fig 13(b), the lower rectangle in one of the images is smaller in width than in the other, so here you see it slanted. Next, I have one last example to show the role a reference plays in perceiving slant.
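        The distinction between the two cases of Fig 13 can be put as a small sketch. It is only an illustration under my own assumptions: the edge positions are hypothetical pixel values, the upper rectangle is taken as the reference and is assumed to have the same width in both images, and the lower rectangle is measured relative to it in each image.

```python
def classify(lower_a, lower_b, upper_a, upper_b):
    """Each argument is (x_start, x_end) of a rectangle in image a or image b."""
    # Measure the lower rectangle relative to the upper (reference) rectangle.
    rel_a = (lower_a[0] - upper_a[0], lower_a[1] - upper_a[0])
    rel_b = (lower_b[0] - upper_b[0], lower_b[1] - upper_b[0])
    width_change = (rel_b[1] - rel_b[0]) - (rel_a[1] - rel_a[0])
    shift = rel_b[0] - rel_a[0]
    if width_change != 0:
        return "slant"         # width differs between the images, as in Fig 13(b)
    if shift != 0:
        return "depth offset"  # same width, shifted sideways, as in Fig 13(a)
    return "same plane"

upper = (100, 200)  # hypothetical reference rectangle, identical in both images
print(classify((100, 200), (110, 210), upper, upper))  # same width, shifted
print(classify((100, 200), (100, 190), upper, upper))  # attached at one end, narrower
```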

slant_dots1.jpg
Fig 14 (a)

        In Fig 14(a), the dotted strip is actually slanted, but the slant cannot be perceived for lack of a reference. In Fig 14(b), a black strip, which is the same size in both images, has been added to give the dotted strip a reference, so its slant is perceived. Make sure that you cover Fig 14(b) while viewing Fig 14(a). Initially I thought the markings would help me perceive the slant very easily, which is why the dots were put on the strip. But to my surprise it takes a lot of concentration to perceive the slant even with the dots. So even though your brain might be measuring the slant in Fig 14(a), it does not perceive it unless you concentrate a lot! Adding a reference makes the job very easy for it.

dots_slant_ref1.jpg
Fig 14 (b)

* Stereo image implicitly means cross stereo unless otherwise specified.
 