Cognitive Systems & Representation


The challenge

We research and build systems that continuously and autonomously analyze sensory input from scenes efficiently, in a context- and task-dependent way, guided by perception, scene memory and acquired world knowledge.
Such systems have to build up an expandable and consistent representation of the structures in their surroundings in short-term and long-term memory. This has to happen in a largely self-organized way from sensory data and behavioral needs, with only a few hints from outside.

An important aspect is how to make the most efficient use of the knowledge from memory during sensory scene analysis. How does such a system concentrate on the necessary elements of a scene and neglect the irrelevant parts? How does a behavioral task guide this process?


Our working hypothesis is that sensory cognition is a control process that links sensory processing modules in a task-dependent way to an internal scene representation.
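As a rough illustration of this hypothesis, the following minimal sketch shows a control step that selects processing modules depending on the task and writes their results into a shared scene representation. All names (the routines, the task table, the `analyze` function) are hypothetical and chosen only for this example; they do not describe our actual systems.

```python
from typing import Callable, Dict, List

# A processing module ("visual routine") maps sensory input to partial scene knowledge.
Routine = Callable[[dict], dict]

# Hypothetical stand-ins for real modules; each one just reads a field of a toy frame.
def detect(frame: dict) -> dict:
    return {"objects": frame.get("objects", [])}

def classify(frame: dict) -> dict:
    return {"labels": frame.get("labels", [])}

def estimate_motion(frame: dict) -> dict:
    return {"motion": frame.get("motion", {})}

ROUTINES: Dict[str, Routine] = {
    "detect": detect,
    "classify": classify,
    "estimate_motion": estimate_motion,
}

# Assumed task table: which modules the controller links in for which task.
TASK_ROUTINES: Dict[str, List[str]] = {
    "follow_object": ["detect", "estimate_motion"],
    "describe_scene": ["detect", "classify"],
}

def analyze(frame: dict, task: str, scene: dict) -> dict:
    """Run only the modules needed for the task and update the scene representation."""
    for name in TASK_ROUTINES[task]:
        scene.update(ROUTINES[name](frame))
    return scene

frame = {"objects": ["car"], "labels": ["vehicle"], "motion": {"car": (1.0, 0.0)}}
scene = analyze(frame, "follow_object", {})
```

In this toy setting, the task "follow_object" fills in objects and motion but skips classification, so the scene representation only holds what the current behavior needs.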


For our research, we combine approaches from several disciplines: brain and computer sciences, mathematical modeling, computer simulations and real-world systems validation. We make use of techniques from the following domains:

  • Biologically-inspired processing: Looking at neural circuits and brain processes gives us ideas on how to solve problems related to cognitive processing.
  • Modern computer vision approaches: Unlike ten years ago, current state-of-the-art vision technology already delivers good results on well-defined basic visual tasks like detection, classification, segmentation, target pursuit and extraction of 3D world structure.
  • Machine learning and probabilistic models: Advances in computer technology and algorithmic theory allow the exploitation of efficient adaptation processes and principled ways of dealing with uncertainty.

Research in computer vision and dynamic scene decomposition

Modern computer vision and machine learning provide the necessary processing modules (“visual routines”) that deliver basic object and environment information. We research new methods for the estimation of saliency, 3D structures, symmetries, parts-based decomposition, ego- and object-motion, image segmentation and object pursuit, and we pay attention to efficient algorithms and implementations.



A focus of our group is on scene analysis and decomposition. To make sense of the large amount of incoming data in a scene, the differently specialized visual routines have to work together in a tightly coupled way. For example, tracking of moving objects requires their detection, their segmentation and their prediction, and can be additionally supported by type information from object classification.

Biologically motivated modeling

The brain is by far the most versatile system when it comes to interpreting scenes for a specific behavioral task. We look at biological findings to understand the solutions that have evolved in nature to solve the problems encountered during scene analysis. Biological knowledge influences how we think about the representation of features, objects and scenes, the accounts of feedforward and feedback information flow, the problem of inference and learning in hierarchies, different types of episodic and semantic memory, as well as task-driven attention and working memory. Although some of the biological solutions depend largely on the specific biological substrate provided by neurons and brain structures, the structure of many solutions is driven by common underlying principles that arise in any model alike. Finding these principles is key to researching and constructing scalable cognitive systems.

Systems that understand visual scenes

When we inspect a crowded visual scene, we can make sense of it within a few hundred milliseconds and trigger appropriate actions. This is basically the quality that we want to achieve in our technical systems when they interpret a scene.
Such a system has to concentrate on salient stimuli. It has to actively select necessary information and neglect irrelevant parts. It has to detect and isolate objects and continuously estimate their dynamics in parallel prediction-confirmation loops.
All this occurs in a coarse-to-fine manner and uses long-term memory about previously encountered, prototypical world structures as well as a short-term working memory that integrates accumulated knowledge about the current scene and its context.
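To give an idea of what a single prediction-confirmation loop can look like, here is a minimal sketch for one object moving along one dimension. It uses a standard alpha-beta filter as a stand-in; this is an illustrative choice, not a description of our systems, and the names `Track`, `predict`, `confirm` and `run_loop` as well as the gain values are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Track:
    """Working-memory entry for one object: estimated position and velocity."""
    position: float
    velocity: float = 0.0

def predict(track: Track, dt: float) -> float:
    """Predict where the object should be after dt, assuming constant velocity."""
    return track.position + track.velocity * dt

def confirm(track: Track, measurement: float, dt: float,
            alpha: float, beta: float) -> float:
    """Compare the prediction with the measurement and correct the track.

    Returns the residual (measurement minus prediction).
    """
    predicted = predict(track, dt)
    residual = measurement - predicted
    track.position = predicted + alpha * residual
    track.velocity = track.velocity + (beta / dt) * residual
    return residual

def run_loop(measurements: List[float], dt: float = 1.0,
             alpha: float = 0.5, beta: float = 0.1) -> Tuple[Track, List[float]]:
    """Initialize a track from the first measurement, then predict and confirm."""
    track = Track(position=measurements[0])
    residuals = [confirm(track, z, dt, alpha, beta) for z in measurements[1:]]
    return track, residuals

# An object moving at constant speed: one unit per time step.
ramp = [float(k) for k in range(40)]
track, residuals = run_loop(ramp)
```

For the constant-speed example, the residuals shrink as the loop runs, and the estimated velocity approaches the true speed. A full system would run many such loops in parallel, one per object, and feed them with detections from the visual routines.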

Application areas include vision systems for robotics and vehicles that autonomously analyze complex situations. Traffic scene understanding in particular requires cognitive systems that combine various sensory processing modules with extensive background knowledge about the world and the tasks, together with task-dependent control processes.