### Selectivity for Stimulus Orientation and Direction

![[ETH/ETH - Computational Vision/Images - ETH Computational Vision/image130.png]]

Hubel and Wiesel performed single-neuron recordings from cat V1. Beyond the orientation selectivity we already discussed, this image shows that certain neurons are also selective for the direction of motion of the stimulus. In primates this selectivity first arises in the cortex, but some species show direction selectivity already in the retina.

### Reichardt Detector

![[ETH/ETH - Computational Vision/Images - ETH Computational Vision/image131.png]]

The Reichardt detector is a computational model, or neural circuit, designed to detect motion in a visual scene by computing the relative change in intensity across space and time. It compares the outputs of two spatially separated photoreceptor neurons, which are sensitive to changes in light intensity. Their outputs are delayed and multiplied, and the final output is a measure of the correlation between the intensity changes at the two spatial locations. If the change in intensity at one location closely follows the change in intensity at the other location, with a small delay, the detector signals motion in a specific direction.

### Visual Motion: 2D

**The "Aperture Problem"**

![[ETH/ETH - Computational Vision/Images - ETH Computational Vision/image132.png]]

This is a fundamental issue in motion perception that arises from the limited spatial extent of the receptive fields of visual neurons, such as those in the primary visual cortex (V1). Because a neuron's receptive field acts like a small aperture, it "sees" only a small portion of the visual scene, making it impossible to determine the true motion direction of an object or pattern from that neuron alone. A classic example is the barber pole illusion: diagonal stripes on a cylinder rotating about its vertical axis appear to move vertically, even though the surface actually moves horizontally.
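The delay-and-multiply scheme of the Reichardt detector described above can be sketched as a small simulation. This is an illustrative sketch, not lecture code: the drifting-sine stimulus, the receptor spacing `dx`, and the delay are arbitrary assumptions.

```python
# Minimal sketch of a Reichardt detector (illustrative; stimulus, receptor
# spacing dx, and delay are arbitrary choices, not values from the course).
import math

def stimulus(x, t, v):
    """Sinusoidal luminance grating drifting with velocity v."""
    return math.sin(2 * math.pi * (x - v * t))

def reichardt_output(v, dx=0.1, delay=5, dt=0.01, steps=2000):
    """Time-averaged output of a two-receptor delay-and-multiply detector."""
    s1 = [stimulus(0.0, k * dt, v) for k in range(steps)]  # receptor at x = 0
    s2 = [stimulus(dx, k * dt, v) for k in range(steps)]   # receptor at x = dx
    out = 0.0
    for k in range(delay, steps):
        # mirror-symmetric half-detectors: delayed s1 times current s2,
        # minus delayed s2 times current s1
        out += s1[k - delay] * s2[k] - s2[k - delay] * s1[k]
    return out / (steps - delay)

# Rightward motion yields a positive signal, leftward a negative one:
print(reichardt_output(+1.0) > 0)  # True
print(reichardt_output(-1.0) < 0)  # True
```

Subtracting the two mirror-symmetric half-detector outputs makes the signal signed by direction; this is the opponent form of the Reichardt model.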
The illusion occurs because of the limited spatial extent of V1 receptive fields and their sensitivity to specific orientations. When observing the barber pole, V1 neurons whose receptive fields match the orientation of the diagonal stripes are activated. Because of their small receptive fields, however, they "see" only a small portion of the stripes and cannot determine the true motion direction: they can only signal the motion component perpendicular to the stripes' orientation. At the same time, the edges of the cylinder constrain the visible motion of the stripes' endpoints to the vertical direction. This vertical motion is consistent with the ambiguous local signals of the V1 neurons, and it becomes the dominant percept.

**The "Aperture Problem" in the Visual System**

![[ETH/ETH - Computational Vision/Images - ETH Computational Vision/image133.png]]

This problem is relevant because our visual system "sees" the world through a collection of small apertures. The experiment above investigated how the visual system resolves the aperture problem by studying the responses of direction-selective MT neurons in the macaque monkey. The findings demonstrated that MT neurons can integrate local motion signals from V1 neurons to determine the true global motion direction, and that they are highly sensitive to coherent motion even in noisy visual stimuli. Referring to the picture: the receptive field of a single V1 cell leaves the object's velocity ambiguous, since multiple velocities are consistent with the local measurement; integrating over two V1 receptive fields disambiguates it and identifies the single velocity vector representing the global motion. Referring to the picture on the right: the left-hand diamond moves to the right; the right-hand diamond moves down.
Note that in both cases, in the local region indicated on each diamond by the small circle, the border moves downward and to the right. The moving edge (under the "diamonds"), which could represent a magnified view of the circled regions of the diamonds' borders, can be generated by any of the motions shown by the arrows. Motion parallel to the edge is not visible, so all motions that share the same component of motion normal to the edge are possible candidates for the "true" motion giving rise to the observed motion of the edge. We may map this set of possible motions as a locus in "velocity space".

![[ETH/ETH - Computational Vision/Images - ETH Computational Vision/image134.png]]

**Intersection of Constraints**: This concept summarizes what we have been discussing: each motion-sensitive neuron provides a constraint on the possible motion of an object or pattern in the visual scene, based on the neuron's preferred orientation. Due to the aperture problem, however, the motion information provided by a single neuron is ambiguous. By combining the constraints from multiple neurons with different preferred orientations, the visual system can determine the true motion direction at the point where the constraints intersect.

### Area MT - Responses to Moving Stimuli

![[ETH/ETH - Computational Vision/Images - ETH Computational Vision/image135.png]]

**Perceived Direction of Gratings and Plaids**

![[ETH/ETH - Computational Vision/Images - ETH Computational Vision/image136.png]]

We can construct stimuli that incorporate the aperture problem. Gratings are regular patterns of alternating light and dark bars, while plaids are formed by superimposing two or more gratings of different orientations. In this context, gratings and plaids can be used to demonstrate the limitations of local motion processing.
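The intersection-of-constraints computation can be made concrete: each grating component only reveals the velocity component along its normal, so it constrains the pattern velocity to a line in velocity space, and two non-parallel lines intersect at a unique velocity. A minimal sketch (the function name `ioc` and the example numbers are illustrative assumptions):

```python
# Intersection of constraints (IoC) for two grating components.
# Each grating constrains the pattern velocity v to the line n_i . v = s_i
# in velocity space; two non-parallel constraint lines intersect at v.
import math

def ioc(theta1, s1, theta2, s2):
    """Solve n1.v = s1 and n2.v = s2 for v = (vx, vy).
    theta_i: direction of grating i's normal (radians); s_i: normal speed."""
    a, b = math.cos(theta1), math.sin(theta1)
    c, d = math.cos(theta2), math.sin(theta2)
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("parallel constraints: aperture problem unresolved")
    vx = (s1 * d - s2 * b) / det
    vy = (a * s2 - c * s1) / det
    return vx, vy

# A plaid moving rightward at speed 1 with component normals at +/-45 deg:
# each component's normal speed is cos(45 deg).
v = ioc(math.radians(45), math.cos(math.radians(45)),
        math.radians(-45), math.cos(math.radians(45)))
print(v)  # approximately (1.0, 0.0): the true rightward pattern velocity
```

A single constraint (one grating) leaves a whole line of candidate velocities, which is exactly the ambiguity a single V1 cell faces behind its aperture.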
For instance, a grating moving behind a circular aperture produces an ambiguous motion signal, since V1 neurons with small receptive fields detect only the motion component perpendicular to the grating's orientation. In the case of plaids, there is no ambiguity about the motion of the whole pattern, since the two families of possible velocities (shown by the dotted lines) intersect at a single point.

**Component and Pattern Direction Selectivity**

![[ETH/ETH - Computational Vision/Images - ETH Computational Vision/image137.png]]

The figure above illustrates the response of a hypothetical direction selective neuron. In each plot, the direction of motion of the stimulus is given by the angle, and the response of the cell to that direction is given by the distance of the point from the origin. The left-hand plot reveals that this "neuron" responded best to gratings moving directly rightward and did not respond to leftward motion. The direction tuning curve for a single grating therefore has a single peak corresponding to the best direction of motion. When one component of a 90-degree plaid (one whose components are oriented at 90 degrees to one another) is within the direction bandwidth of the neuron, the other component will be outside the acceptable range. If the neuron is component direction selective, the predicted direction tuning curve for a plaid is therefore the sum of the responses to the two components presented separately. Before the responses are added, however, any spontaneous firing rate (here zero) is subtracted from each; after the two responses are added, the spontaneous rate is added back in. In the right-hand plot, responses are plotted as a function of the direction of motion of a plaid. When the plaid moves in the optimal direction (as determined with a single grating), the components are oriented 45 degrees to either side of the optimum.
Thus the response peaks are also shifted to either side by 45 degrees, and the predicted tuning curve for the plaid is a bi-lobed curve whose peaks straddle the single peak obtained in the single-grating experiment. This prediction is shown by the solid lines in the right-hand plot. The prediction for pattern direction selectivity is even simpler: the neuron's tuning curves for the two stimuli should be similar, since their directions of motion are the same. The predicted tuning curve is thus simply the curve derived from the single-grating experiment, shown by the dashed lines in the right-hand plot. The basis of this test is to dissociate the oriented components of a pattern from the direction in which they move: a single grating always moves at right angles to its orientation, but a plaid moves at a different angle from its oriented components (45 deg in the figure). The two predictions for the different types of direction selectivity are radically different, and one may simply see whether the neuron's response depends on the overall direction of motion or on the orientation of the moving components.

**Responses of a V1 Cell**

![[ETH/ETH - Computational Vision/Images - ETH Computational Vision/image138.png]]

The figure shows the directional selectivity of a special complex cell recorded in area 17 of a cat. On the left is the neuron's tuning for the direction of motion of single gratings; on the right is the neuron's response to moving 90-degree plaids. The dashed curve on the right shows the expected response of a component direction selective neuron. The inner circles in each plot show the neuron's maintained discharge level.

**Responses of two MT Cells**

![[ETH/ETH - Computational Vision/Images - ETH Computational Vision/image139.png|600]]

This figure shows data, in the same format as the previous figure, for two neurons recorded from MT.
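The component and pattern predictions described above can be sketched numerically. This assumes a hypothetical Gaussian direction-tuning curve; the 30-degree tuning width is an arbitrary choice, not a value from the figures.

```python
# Component vs. pattern predictions for a 90-deg plaid, sketched with a
# hypothetical Gaussian direction-tuning curve (tuning width is assumed).
import math

def grating_response(direction, preferred=0.0, width=30.0):
    """Response to a single grating moving in `direction` (degrees)."""
    d = (direction - preferred + 180.0) % 360.0 - 180.0  # wrap to [-180, 180)
    return math.exp(-0.5 * (d / width) ** 2)

def component_prediction(plaid_direction):
    # the two components of a 90-deg plaid move 45 deg to either side
    return (grating_response(plaid_direction - 45.0) +
            grating_response(plaid_direction + 45.0))

def pattern_prediction(plaid_direction):
    # a pattern direction selective neuron follows the overall plaid direction
    return grating_response(plaid_direction)

# Pattern prediction peaks at the preferred direction (0 deg); the component
# prediction is bi-lobed, peaking near +/-45 deg instead:
print(pattern_prediction(0) > pattern_prediction(45))      # True
print(component_prediction(45) > component_prediction(0))  # True
```

The bi-lobed component curve versus the single-peaked pattern curve is exactly the dissociation the plaid test exploits.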
The neuron in the first "row" preferred upward movement of single gratings; like its component direction selective counterpart in V1 (previous image), this preference was translated into a dual preference for two directions 45 degrees apart when it was tested with 90-degree plaids. As comparison of the data with the dashed lines in the right-hand plot of the first row reveals, the component direction selective prediction provided a very good description of this behavior. About 40% of the cells studied in MT were clearly component direction selective. The second row of the figure shows data from an MT neuron whose behavior was rather different. This neuron preferred downward and rightward movement of grating stimuli and maintained this preference when tested with 135-degree plaids. Its actual response to plaids differed dramatically from the component direction selective prediction.

**Population Analysis**

![[ETH/ETH - Computational Vision/Images - ETH Computational Vision/image140.png]]

To examine the distribution of behavior of neurons in different areas, this figure shows scatter diagrams in which the pattern and component correlation coefficients are plotted against one another. The first subplot illustrates the significance of the various regions of these plots. The region marked "component" is a zone in which the component correlation coefficient significantly exceeds either zero or the pattern correlation coefficient, whichever is larger. The region marked "pattern" similarly marks neurons that were unambiguously pattern direction selective. The region marked "unclassed" represents cases in which both correlations significantly exceeded zero but did not differ significantly from one another, or in which neither correlation coefficient differed significantly from zero. The middle subplot shows a scatter plot of data in this space for 69 neurons recorded from cat and monkey V1.
It is clear that these cluster around a component correlation value of 1 and a pattern correlation value of 0. While a few neurons lie in the two indeterminate regions of the plot, there are no clearly pattern direction selective cases. The right subplot shows a scatter diagram of the directional selectivity of 108 MT neurons tested with 135-degree plaids. Most MT neurons are rather more broadly tuned for direction than their counterparts in V1, and in consequence the distinction between the component and pattern predictions often cannot be made very clearly with 90-degree plaids; hence the wider 135-degree plaids were used.
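The pattern and component correlation coefficients used in these scatter plots are commonly computed as partial correlations between the measured plaid responses and the two predictions, removing the correlation that the predictions share with each other. A sketch of that computation; the toy "measured" responses, the Gaussian tuning curve, and the jitter are fabricated for illustration:

```python
# Sketch of the partial-correlation analysis used to classify neurons as
# pattern or component direction selective. Toy data only.
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return sxy / (sx * sy)

def partial_correlations(actual, pattern_pred, component_pred):
    """Return (Rp, Rc): correlation of the data with each prediction,
    partialling out the correlation between the two predictions."""
    rp = pearson(actual, pattern_pred)
    rc = pearson(actual, component_pred)
    rpc = pearson(pattern_pred, component_pred)
    Rp = (rp - rc * rpc) / math.sqrt((1 - rc ** 2) * (1 - rpc ** 2))
    Rc = (rc - rp * rpc) / math.sqrt((1 - rp ** 2) * (1 - rpc ** 2))
    return Rp, Rc

def gauss(d, width=30.0):
    d = (d + 180.0) % 360.0 - 180.0
    return math.exp(-0.5 * (d / width) ** 2)

directions = list(range(-180, 180, 30))
pattern_pred = [gauss(d) for d in directions]
component_pred = [gauss(d - 45) + gauss(d + 45) for d in directions]
# toy measured plaid responses: pattern-like plus small deterministic jitter
actual = [p + 0.1 * math.sin(3.7 * i) for i, p in enumerate(pattern_pred)]

Rp, Rc = partial_correlations(actual, pattern_pred, component_pred)
print(Rp > Rc)  # True: this toy neuron looks pattern direction selective
```

In the actual analysis, significance tests on these coefficients define the "pattern", "component", and "unclassed" regions of the scatter plots.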