Information driven self-organization of complex robotic behaviors

      by Georg Martius, Ralf Der and Nihat Ay; see here for the PDF and BibTeX files
Abstract: Information theory is a powerful tool to express principles to drive autonomous systems because it is domain invariant and allows for an intuitive interpretation. This paper studies the use of the predictive information (PI), also called excess entropy or effective measure complexity, of the sensorimotor process as a driving force to generate behavior. We study nonlinear and nonstationary systems and introduce the time-local predicting information (TiPI) which allows us to derive exact results together with explicit update rules for the parameters of the controller in the dynamical systems framework. In this way the information principle, formulated at the level of behavior, is translated to the dynamics of the synapses. We underpin our results with a number of case studies with high-dimensional robotic systems. We show the spontaneous cooperativity in a complex physical system with decentralized control. Moreover, a jointly controlled humanoid robot develops a high behavioral variety depending on its physics and the environment it is dynamically embedded into. The behavior can be decomposed into a succession of low-dimensional modes that increasingly explore the behavior space. This is a promising way to avoid the curse of dimensionality which hinders learning systems to scale well.
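As a rough illustration of the quantity named in the abstract, the following sketch estimates the one-step predictive information of a toy process. It is not the TiPI and not the controller update rule from the paper: it assumes a stationary, linear, Gaussian AR(1) process, for which the predictive information reduces to the mutual information between consecutive values and has the closed form -0.5 * ln(1 - a^2). The function names and the parameter values (a, noise_std, steps) are hypothetical and chosen only for this example.

```python
import math
import random


def simulate_ar1(a, noise_std, steps, seed=0, burn_in=1000):
    """Generate x[t+1] = a * x[t] + Gaussian noise (toy process, an assumption)."""
    rng = random.Random(seed)
    x = 0.0
    series = []
    for t in range(steps + burn_in):
        x = a * x + rng.gauss(0.0, noise_std)
        if t >= burn_in:  # discard the transient so the samples are near-stationary
            series.append(x)
    return series


def gaussian_one_step_pi(series):
    """Estimate I(x[t]; x[t+1]) in nats, assuming jointly Gaussian variables."""
    past, future = series[:-1], series[1:]
    n = len(past)
    mean_p = sum(past) / n
    mean_f = sum(future) / n
    cov = sum((p - mean_p) * (f - mean_f) for p, f in zip(past, future)) / n
    var_p = sum((p - mean_p) ** 2 for p in past) / n
    var_f = sum((f - mean_f) ** 2 for f in future) / n
    rho_sq = cov * cov / (var_p * var_f)  # squared correlation coefficient
    return -0.5 * math.log(1.0 - rho_sq)


def exact_ar1_pi(a):
    """Closed-form value for a stationary AR(1) process with |a| < 1."""
    return -0.5 * math.log(1.0 - a * a)


series = simulate_ar1(a=0.8, noise_std=1.0, steps=50000)
estimate = gaussian_one_step_pi(series)
exact = exact_ar1_pi(0.8)
print(f"estimated PI: {estimate:.3f} nats, closed form: {exact:.3f} nats")
```

In this toy setting a larger |a| makes the next value more predictable from the current one and so raises the predictive information. The nonlinear, nonstationary case treated in the paper requires the time-local formulation and is not covered by this sketch.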
Code: Download the sources: Simulations.zip. In the archive you will find a README file with some basic information.



Video S1: Armband robot starts to locomote and overcomes obstacles. Each joint (hinge or slider) is controlled individually and independently by a one-dimensional TiPI-maximizing controller. Locomotion starts through spontaneous cooperation of the individual components and spontaneous symmetry breaking (moving left or right). The text in the video should say Epsilon=0.005. Parameters: (epsilon=0.005, eta=0.005).



Video S2: Humanoid robot on the ground. A single high-dimensional TiPI-maximizing controller is used here (17 DoF). The controller starts from a small unit initialization, so the robot initially lies calmly. After an initial phase in which the parameters adjust to create some activity, we observe smooth patterns of behavior that come and go with time. Within short time intervals one sees several repetitions of one mode until it vanishes and a new one emerges. Parameters: (epsilon=0.0002, eta=0.1).



Video S3: Humanoid robot hanging from a bungee. The bungee is not visualized. It acts as a spring force on the upper body. Its upper anchor point (visible as a yellow sphere at the end of the clip) is not fixed in x-y, only in height, so the humanoid can in principle walk along the ground. See Video S2 for details. Note the different patterns of behavior just because of the different physical situation.



Video S4: Humanoid robot at a high bar. The hands of the robot are attached to the high bar but remain free to rotate and to move along the bar. See Video S2 for details. Note the different patterns of behavior just because of the different physical situation.



Video S5: Humanoid robot falling into a pit. See Video S2 for details. Note the different patterns of behavior just because of the different physical situation.
