|A Minecraft API is now available to train your AGIs
We are working hard on experiments, and software to run experiments. So this week there is no normal blog post. Instead, here’s an eclectic mix of links we’ve noticed recently.
First, AlphaGo continues to make headlines. Of interest to Project AGI is Yann LeCun agreeing with us that unsupervised hierarchical modelling is an essential step in building intelligence with humanlike qualities . We also note this IEEE Spectrum post by Jean-Christophe Baillie  which argues, as we did , that we need to start creating embodied agents.
Of course, to play Minecraft well you need to balance local activities – building, mining etc. – with exploration. Another frontier, beyond AlphaGo, is exploration. Monte-Carlo Tree Search (as used in AlphaGo) explores in more limited ways than humans do, argues John Langford .
Sharing places with robots
This excellent paper on robotic grasping also caught our attention . A key challenge in this area is adaptability to slightly varying circumstances, such as variations in the objects being grasped and their pose relative the the arm. General solutions to these problems will suddenly make robots far more flexible and applicable to a greater range of tasks.
Hierarchical Quilted Self-Organizing Maps & Distributed Representations
Last week I also rediscovered this older paper on Hierarchical-Quilted Self-Organizing Maps (HQSOMs) .This is close to our hearts because we originally believed this type of representation was the right approach for AGI. With the success of Deep Convolutional Networks (DCNs) it’s worth looking back and noticing the similarities between the two. While HQSOM is purely unsupervised learning, (a plus, see comment from Yann LeCun above) DCNs are trained by supervised techniques. However, both methods use small, overlapping, independent units – analogous to biological cortical columns – to classify different patches of the input. The overlapping and independent classifiers lead to robust and distributed representations, which is probably the reason these methods work so well.
Distributed representation is one of the key features of Hawkins’ Hierarchical Temporal Memory (HTM). Fergal Byrne has recently published an updated description of the HTM algorithm  for those interested.
We at Project AGI believe that a grid-like “region” of columns employing a “Winner-Take-All” policy , with overlapping input receptive fields, can produce a distributed representation. Different regions are then connected together into a tree-like structure (acyclic). The result is a hierarchy. Not only does this resemble the state-of-the-art methods of DCNs, but there’s a lot of biological evidence for this type of representation too. This paper by Rinkus  describes columnar features arranged into a hierarchy, with winner-take-all behaviour implemented via local inhibition.
Rinkus says: “Saying only that a group of L2/3 units forms a WTA CM places no a priori constraints on what their tuning functions or receptive fields should look like. This is what gives that functionality a chance of being truly generic, i.e., of applying across all areas and species, regardless of the observed tuning profiles of closely neighboring units.”
But unsupervised learning can’t be the only form of learning. We also need to consider consequences, and so we need reinforcement learning to take account of these. As Yann said, the “cherry on the cake” (this is probably understating the difficulty of the RL component, but right now it seems easier than creating representations).
Shakir’s Machine Learning blog has a great post exploring the biology of reinforcement learning  within the brain. This is a good overview of the topic and useful for ML researchers wanting to access this area.
But regular readers of this blog will remember that we’re obsessed with unfolding or inverting abstract plans into concrete actions. We found a great paper by Manita et al  that shows biological evidence for the translation and propagation of an abstract concept into sensory and motor areas, where it can assist with perception. This is the hierarchy in action.
Long-Short-Term Memory (LSTM)
One more tack before we finish. Thanks to Jay for this link to NVIDIA’s description of LSTMs , an architecture for recurrent neural networks (i.e. the state can depend on the previous state of the cells). It’s a good introduction, but we’re still fans of Monner’s Generalized LSTM .
Now let’s end with something fun. Wired magazine again, describing watching AlphaGo as our first taste of a superhuman intelligence . Although this is a “narrow” intelligence, not a general one, it has qualities beyond anything we’ve experienced in this domain before. What’s more, watching these machines can make us humans better, without any nasty bio-engineering:
“But as hard as it was for Fan Hui to lose back in October and have the loss reported across the globe—and as hard as it has been to watch Lee Sedol’s struggles—his primary emotion isn’t sadness. As he played match after match with AlphaGo over the past five months, he watched the machine improve. But he also watched himself improve. The experience has, quite literally, changed the way he views the game. When he first played the Google machine, he was ranked 633rd in the world. Now, he is up into the 300s. In the months since October, AlphaGo has taught him, a human, to be a better player. He sees things he didn’t see before. And that makes him happy. “So beautiful,” he says. “So beautiful.”