![[ETH/ETH - Computational Vision/Images - ETH Computational Vision/image125.png|500]]
A stereoscope is an optical instrument that demonstrates depth perception in binocular vision. It presents two separate images, each showing a scene from the viewpoint of one eye, so that viewing them with both eyes creates the illusion of a single 3D image.
- Two images are created by capturing or drawing the same scene from two slightly different viewpoints, mimicking the horizontal separation of the left and right eyes. These images, known as stereopairs or stereograms, contain slightly different information, just as the images captured by our eyes do.
- The stereoscope consists of a frame with two mirrors placed at a 90-degree angle to each other, with one mirror in front of each eye. The stereopair images are positioned on either side of the mirrors, facing them.
- When the viewer looks into the mirrors, each eye sees the reflection of one of the images from the stereopair. The mirrors ensure that the images are aligned correctly, with the left eye viewing the left image and the right eye viewing the right image.
- The brain processes the image from each eye separately and then combines them into a single, coherent image. Because of the small differences between the two images (binocular disparity), the brain perceives depth in the scene, creating the illusion of a 3D image.
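
To make the link between image differences and depth concrete, here is a minimal sketch that projects one point into two idealized pinhole "eyes" and measures the horizontal offset between the two projections. The pinhole model, the 6.5 cm eye separation, the unit focal length, and the function names `project` and `disparity` are illustrative assumptions; none of them come from the notes above.

```python
def project(point_x, point_z, eye_x, focal_length=1.0):
    """Horizontal image coordinate of a point seen by a pinhole eye at (eye_x, 0).

    Assumes both eyes look straight ahead along the z axis (parallel optical axes).
    """
    return focal_length * (point_x - eye_x) / point_z


def disparity(point_x, point_z, baseline=0.065, focal_length=1.0):
    """Difference between the left-eye and right-eye image coordinates."""
    left = project(point_x, point_z, -baseline / 2, focal_length)
    right = project(point_x, point_z, +baseline / 2, focal_length)
    return left - right


# The same lateral position at increasing depths (in metres).
for z in (0.5, 1.0, 2.0, 4.0):
    print(f"depth {z:3.1f} m -> disparity {disparity(0.1, z):.5f}")
```

Under these assumptions the disparity equals `focal_length * baseline / point_z`, so it shrinks as the point moves away: nearby objects differ more between the two images than distant ones, which is the cue a stereopair reproduces.
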
**The Correspondence Problem in Stereopsis**
![[ETH/ETH - Computational Vision/Images - ETH Computational Vision/image126.png|500]]
As the figure above shows, the two retinal images alone do not determine where the objects are in space. To decide that a given point on the left retina corresponds to a specific point on the right retina, we have to assume a particular spatial arrangement of the points. The bottom of the figure shows all the arrangements of the 4 points that are consistent with the same retinal images. To choose among them, the brain relies on heuristics.

The correspondence problem arises because the visual system must search two different images for matching points or features, and those images often contain many similar or repetitive elements. It is particularly hard in complex or textured scenes, where many elements look alike and the correct match becomes ambiguous.
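
The size of the ambiguity can be illustrated with a small sketch. It enumerates every one-to-one pairing between 4 indistinguishable points on the left retina and 4 on the right, then filters them with an ordering constraint (matched points keep the same left-to-right order in both images). The coordinates are made up, and the ordering constraint is only one example of a heuristic from computational stereo; the notes do not say which heuristics the brain actually uses.

```python
from itertools import permutations


def candidate_matchings(left_xs, right_xs):
    """All one-to-one pairings of left and right image points.

    Each matching is a list of (left_x, right_x) pairs.
    """
    matchings = []
    for perm in permutations(range(len(right_xs))):
        matchings.append([(l, right_xs[j]) for l, j in zip(left_xs, perm)])
    return matchings


def preserves_order(pairs):
    """Ordering heuristic (an assumption): sorting by left position also sorts the right positions."""
    rights = [r for _, r in sorted(pairs)]
    return rights == sorted(rights)


# Hypothetical retinal x-coordinates of 4 identical-looking points.
left = [-0.30, -0.10, 0.10, 0.30]
right = [-0.35, -0.12, 0.08, 0.25]

all_matchings = candidate_matchings(left, right)
ordered = [m for m in all_matchings if preserves_order(m)]

print(len(all_matchings))  # 24 one-to-one pairings (4!)
print(len(ordered))        # 1 pairing survives the ordering heuristic
```

Without any constraint, the number of candidate matchings grows factorially with the number of similar features, which is why repetitive textures are hard. A single heuristic can already prune most candidates, although it may also reject a correct match in scenes that violate it.
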