In my last post I described a thought experiment to understand some of the roles actions play in perception. The question we asked in the thought experiment was whether our brains could learn spatial concepts and transformations efficiently by passively observing video streams without ever making an action. I think the answer is NO, and I will briefly describe why.

Let us say we have a video camera whose output is fed to a very smart algorithm that is trying to learn spatial concepts from passively observing the video stream. We want the algorithm to learn concepts like ‘translating to the right’. To teach the algorithm, we show it videos with different objects undergoing translations. Let us consider an abstraction of this setting to see why inductive learning of a concept like translation can be very expensive in general.

Suppose we are given vector pairs (x1, x1’), (x2, x2’), (x3, x3’), … where each xi and xi’ is a vector. We are told that the two vectors in every pair are related by the same transformation *f*. That is, *x1’ = f(x1), x2’ = f(x2),* and so on. However, we do not know what the transformation *f* is. The learning problem is to figure out the unknown transformation *f* from the observed set of vector pairs. If the algorithm makes no assumptions about the nature of *f*, then it is working in an infinitely large hypothesis space with no structured way of exploring it: any function that agrees with the observed pairs is a candidate, and the pairs alone say nothing about how it behaves on vectors we have not seen. Learning *f* will therefore not be possible in a reasonable amount of time.
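To get a feel for the numbers, here is a small sketch in Python. It is a toy of my own construction, not a model of the brain or of any real learning algorithm: I assume the vectors are short binary tuples and that the restricted hypothesis space contains only cyclic shifts ("translations" on a ring). The function names are made up for this illustration.

```python
def count_unrestricted(n):
    """Number of distinct maps from n-bit vectors to n-bit vectors.

    Each of the 2**n inputs can be sent to any of the 2**n outputs.
    """
    return (2 ** n) ** (2 ** n)


def shift(x, k):
    """Cyclically shift the tuple x to the right by k positions."""
    k %= len(x)
    return x[-k:] + x[:-k] if k else x


def consistent_shifts(pairs):
    """Return every shift k (0 <= k < n) that explains all observed pairs.

    This is the whole restricted hypothesis space: only n candidates.
    """
    n = len(pairs[0][0])
    return [k for k in range(n) if all(shift(x, k) == y for x, y in pairs)]


# Two observed (x, x') pairs, both produced by "move one step to the right".
pairs = [
    ((1, 0, 0, 0), (0, 1, 0, 0)),
    ((1, 1, 0, 0), (0, 1, 1, 0)),
]

print(count_unrestricted(4))     # 18446744073709551616 candidate maps
print(consistent_shifts(pairs))  # [1]
```

Even for 4-bit vectors the unrestricted learner faces 2^64 possible maps, while the learner that assumes "*f* is a shift" has only four candidates and settles on the right one from a pair or two. Uninformative pairs still exist under the restriction (an all-ones vector is consistent with every shift), but the search is at least tractable.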

So, at the very least, the algorithm has to make assumptions about the space of transformations. The motor system could provide this set of assumptions to the perceptual learning algorithm and so restrict its hypothesis space. However, I think the motor system plays a more important role in perceptual learning than just providing a set of assumptions. In my next post on this subject I will describe some deep insights Henri Poincaré had on the role of actions in perception.

