Importance of being slow..

What is most amazing about the mammalian visual system is its invariance property: it can recognize objects despite wide variation in their appearance. Learning these invariant representations in the right way is one of the central challenges in vision and neuroscience research.

When I joined the Redwood Neuroscience Institute (RNI) back in 2003, I was still a novice in the neuroscience literature. I had implemented a simple version of HMAX/Neocognitron and re-implemented Rajesh Rao’s hierarchical Kalman filter paper by then, but still did not know enough about other vision research. Having no knowledge can sometimes be an advantage: it enables original thinking. One day I came up with a thought experiment on learning invariant representations.

This thought experiment is given in more detail in my thesis, but I will paraphrase it here. A cat walking towards its bowl of milk has to know that it is the same bowl of milk even though the images on the cat’s retina vary dramatically from instant to instant. Nobody supervised the cat to teach it that all those images are actually the same milk bowl; it had to learn this in an unsupervised fashion. Thinking more about it, I concluded that there is no way the cat could do it without using temporal proximity: images that arrive close together in time almost always belong to the same object, so time itself can act as the teacher.

It turned out that using time as a supervisor is actually not a new idea. Slow Feature Analysis (SFA) is one method for implementing it. A paper from Laurenz Wiskott showed how a feature hierarchy can be constructed using unsupervised learning driven by temporal proximity. Geoff Hinton proposed this idea quite a long time ago. In fact, before slow feature analysis came along, there was VisNet from Edmund Rolls’s lab that implemented this idea in a hierarchy. Even before that, Foldiak showed that the idea could work in principle in simple systems. Later I found that it can be traced back even to early philosophers. So much for my original idea!

The SFA paper showed that one can use temporal slowness to extract invariant representations. As you go up the hierarchy, you can have responses that remain invariant while maintaining selectivity between objects. However, the technique used in SFA is not scalable. It relies on specifying a set of non-linear function expansions of the input space and then applying a linear technique to find the slowly varying features. The non-linear expansion space that you specify need not contain the invariant functions you are looking for, and if you try to enlarge the repertoire of candidate functions, you run into the curse of dimensionality.
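
To make that two-step recipe concrete, here is a minimal sketch of SFA in Python/NumPy. This is my own toy illustration rather than code from the SFA paper; the function names and the toy signal are invented for the example. The steps: expand the input non-linearly (quadratically here), whiten the expanded signal, and keep the directions along which the temporal derivative has the least variance.

```python
import numpy as np

def sfa(x, n_features=2):
    """Minimal linear SFA sketch: find unit-variance projections of x
    whose outputs change most slowly over time.
    x has shape (T, D): T time steps, D input dimensions."""
    x = x - x.mean(axis=0)                      # 1. center the signal
    cov = np.cov(x, rowvar=False)
    eigval, eigvec = np.linalg.eigh(cov)
    whitener = eigvec / np.sqrt(eigval + 1e-9)  # 2. whiten (eps guards tiny eigenvalues)
    z = x @ whitener
    dz = np.diff(z, axis=0)                     # 3. finite-difference "derivative"
    dcov = np.cov(dz, rowvar=False)
    _, deigvec = np.linalg.eigh(dcov)           # 4. eigh sorts eigenvalues ascending,
    w = deigvec[:, :n_features]                 #    so the first columns are the slowest
    return z @ w

def quadratic_expand(x):
    """The non-linear expansion step: append all degree-2 monomials."""
    T, D = x.shape
    quad = np.stack([x[:, i] * x[:, j]
                     for i in range(D) for j in range(i, D)], axis=1)
    return np.concatenate([x, quad], axis=1)

# Toy demo: a slow sine hidden inside two fast-varying mixtures.
t = np.linspace(0, 2 * np.pi, 2000)
slow, fast = np.sin(t), np.sin(37 * t)
x = np.stack([fast + 0.5 * slow, fast - 0.5 * slow], axis=1)
recovered = sfa(quadratic_expand(x), n_features=1)  # ~ sin(t), up to sign and scale
```

The demo recovers the slow sinusoid even though both raw channels are dominated by the fast one. Notice, though, that the quadratic expansion already grows as the square of the input dimension; richer repertoires of candidate invariants grow even faster, which is exactly the curse of dimensionality mentioned above.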


Responses to Importance of being slow..


  1. Sean says:

    Interesting post, thanks. Even Numenta’s old algorithms, biologically grounded as they were, seemed to struggle mightily with duplicating the invariance of the human brain. I recall Subutai at Numenta giving a speech last year in which he noted that the vision toolkit had only been scaled up to 50 or so image categories (and probably wasn’t getting great results with that number of categories), while I believe that T. Poggio has said that the human brain learns several thousand types of objects, so we are talking about a hundred-fold difference (realistically, probably more than that). It will be interesting to see whether the new HTM software can begin to bridge this huge gap in the ability to learn invariance. The commercial possibilities for a vision system that could even begin to approach human-like object recognition would be tremendous.

    • dileep says:

      Sean, you are right. Making invariance and selectivity play nicely in the hierarchy is a hard thing to do. The solution will look easy, but the search for that solution takes multiple attempts. It turns out that none of the hierarchical systems out there (convolutional nets, deep belief nets, HMAX, HTMs) have got the hierarchy really correct. The first few versions of the algorithms we built at Numenta were basic starting points for understanding how hierarchy works. I think I have some good ideas about what is going wrong in these hierarchical systems, and I am working on those now. Will post here when I have something interesting to share.

      • Sean says:

        Interesting. Hadn’t thought about the hierarchical structure itself being the source of the problem. I will say that it is difficult for me to visualize the way that a simple tree-shaped hierarchy could contain concepts as we know them. For instance, when we recognize something as the invariant concept of “shoe,” undoubtedly there is a node/region that stores the invariant concept of “shoe.” Yet, we also recognize a shoe by its color, texture, shape, style (athletic, dressy, etc.) and many other traits. None of those traits are stored in the “shoe” representation area, so there must be a way that the brain connects a region that contains a particular invariant representation with any other region that contains concepts that could pertain to a shoe. There must be many, many interconnects both within and between levels of the hierarchy for this to work. Further, I wonder if some traits (color being a good example) are even hierarchical in nature. Color does not seem like something that requires increasingly abstract representations for the brain to figure it out. There are not smaller visual subcomponents that make something “blue” or “white.” One wonders where a concept like color would reside in a hierarchy of space and time. These types of issues are puzzling to me when trying to figure out the brain as a hierarchy.

        • dileep says:

          Sean, I agree with you that the tree-structured hierarchy is a gross simplification. You could think of the visual cortex as having a mixture of several such hierarchies, some very shallow (you mentioned color as an example) and some very deep, but all sharing data at the appropriate level of abstraction. But there are many smaller steps that need to be taken before we tackle all that complexity. One definite improvement over the tree-structured hierarchy is having continuous levels without boundaries. Numenta is definitely working on that. Even in a simpler hierarchy, many details of how the representations are learned determine whether the hierarchy works in a scalable manner or not.
