Welcome to the

Vision and Perception Science Lab Homepage

Neural Models of Visual Information Processing, Computational Vision and Learning


Prof. Heiko Neumann

Institute of Neural Information Processing
Faculty of Engineering and Computer Sciences
Ulm University
Oberer Eselsberg
D-89069 Ulm
Germany
Tel. +49 (0)731 50-24158
Fax +49 (0)731 50-24156
email: heiko.neumann@uni-ulm.de

General research overview ( Vision Science @ Ulm University)

The research of our group focuses on the mechanisms and the underlying structure of visual information processing in biological and technical systems (biological and machine vision) as well as their adaptation to changing environments (visual learning). The individual topics are detailed in the section Research interests and facilities below. The neural computational mechanisms that we use to model the generation of complex visual behavior and adaptation also transfer to related processes such as auditory processing and the control of (invertebrate) motor pattern generation.

Our research program takes an integrative approach that combines the analysis of empirical data from psychophysics, neuroscience and imaging studies with the mathematical and computational investigation of the underlying neural processes. We conduct psychophysical experiments in our computational perception lab, where we use 2D and 3D computer graphics methods to generate synthetic test stimuli and sequences. These stimuli simulate complex visual displays and animations, e.g., for spatio-temporal grouping, surface perception, or spatial navigation in various environments. We also conduct experiments using head and eye tracking facilities; eye tracking in particular enables us to record eye movement traces and the durations of fixations on scene objects. The results of such experimental investigations (i) guide our modeling to gain insights into the computational mechanisms of brain function and (ii) advance application-oriented investigations into, e.g., attention processes for feature selection.

The results of neural modeling yield new insights into the computational mechanisms underlying complex brain function and steer the development of new approaches and mechanisms for computational vision, image processing and (space-variant) active vision. These developments contribute new methods to several application areas such as vision in perceptual and attentive human-computer interfaces, biometric systems, automotive technologies, medical image analysis and recognition, and visually guided robotics. The development of future emergent technology in human-computer interaction (HCI) aims at advanced mechanisms that endow computers with more human-like capabilities and performance. In order to provide a cross-disciplinary forum for joint research activities, we have founded the interdisciplinary Competence Center Perception and Interactive Technology (PIT), together with the Institutes of Media Computing and Information Technology at Ulm University. Here, researchers and technology from various scientific disciplines, such as computer science and engineering, medicine, psychology and the neurosciences, are brought together.

A brief summary of our overall research statement and recent research topics of our group can be found on the Vision Science @ Ulm University poster.



Research interests and facilities

Modeling mechanisms of static surface perception
Surfaces of environmental objects create structure in the ambient light patterns that enables visually guided interaction with the environment, such as avoiding obstacles, grasping objects, or locating and identifying members in a social group. In order to develop a coherent model of core mechanisms and their interaction, we investigate, among others, the neural mechanisms that underlie the detection of features, the grouping of extended boundaries, texture segregation, and the regularization of such low- and mid-level vision processes. We particularly pursue an approach that highlights the importance of feedforward and feedback connections in the neural network architecture.
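
As a rough illustration of how feedback can shape such low- and mid-level representations, the following Python sketch enhances feedforward feature responses multiplicatively by feedback and then normalizes the result, so that feedback can bias, but not create, activity. This is a simplification written for this page, not the lab's actual model; the function name and parameter values are chosen purely for illustration.

import numpy as np

def modulatory_feedback(feedforward, feedback, gain=1.0, eps=0.01):
    # Toy sketch: feedforward responses are multiplicatively enhanced by
    # feedback (zero feedforward input stays zero), then the result is
    # divisively normalized across the feature dimension.
    enhanced = feedforward * (1.0 + gain * feedback)
    return enhanced / (eps + enhanced.sum(axis=-1, keepdims=True))

# Four orientation channels at one location; feedback favors channel 1.
ff = np.array([0.2, 0.5, 0.4, 0.1])
fb = np.array([0.0, 1.0, 0.0, 0.0])
print(modulatory_feedback(ff, fb))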

Modeling mechanisms of motion perception
The neural mechanisms underlying motion segregation and integration still remain unclear to a large extent. Local motion estimates often remain ambiguous (the aperture problem) when localized form components, such as corners or junctions, are lacking. Even in the presence of such features, local motion estimates may be erroneous if the features were generated by mutual occlusions of different objects. We have developed neural model mechanisms of visual motion analysis that robustly disambiguate locally conflicting cues. Building upon these local motion estimates, large-field optical flow patterns are extracted to generate distributed representations of neural activation. These responses contribute to the perception of transparent motion and can be used to control visual tasks such as navigation in complex 3D environments.
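
The aperture problem itself has a simple geometric core: a local detector that only sees an oriented edge can measure the velocity component along the edge normal, so a single measurement constrains but does not determine the 2D velocity. The following sketch illustrates the geometry only, not our neural model; names and values are hypothetical. It recovers the full velocity by combining several normal-flow constraints in a least-squares sense.

import numpy as np

def combine_normal_flow(normals, speeds):
    # Each constraint states n . v = s: only the velocity component along
    # the edge normal n is measurable locally (the aperture problem).
    # With differently oriented constraints the full 2D velocity follows
    # from a least-squares solution.
    N = np.asarray(normals, dtype=float)   # shape (k, 2), unit normals
    s = np.asarray(speeds, dtype=float)    # shape (k,), normal speeds
    v, *_ = np.linalg.lstsq(N, s, rcond=None)
    return v

# True motion (1, 2) observed through two differently oriented edges.
true_v = np.array([1.0, 2.0])
normals = np.array([[1.0, 0.0], [0.0, 1.0]])
print(combine_normal_flow(normals, normals @ true_v))   # -> [1. 2.]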

Neurodynamical modeling and perceptual psychophysics
We study the computational processes underlying perceptual and cognitive mechanisms at different levels of abstraction, namely the dynamic behavior of ensembles of neurons with their spatio-temporal firing patterns as well as the dynamics at the system level, represented by the mean activities of model neurons (firing rates). The mathematical modeling and computer simulation of such models help to understand the mechanisms and functions underlying visually guided behavior in biological systems. These results are further utilized to design new bio-inspired and neuromorphic technology to enhance, e.g., robotic and human-computer interface technology. We also conduct psychophysical experiments in our perception laboratory to investigate human perception and cognition; the resulting data are again used to steer the modeling process.
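
At the system level such models often reduce to coupled firing-rate equations. The following sketch is a generic textbook-style rate model, not one of our published architectures, and its parameters are arbitrary; it integrates tau * dr/dt = -r + f(W r + I) with a rectifying transfer function, which is enough to show how mutual inhibition lets the more strongly driven unit dominate.

import numpy as np

def simulate_rates(W, I, tau=10.0, dt=1.0, steps=200):
    # Euler integration of a generic firing-rate network:
    #   tau * dr/dt = -r + f(W r + I),   f = rectification
    r = np.zeros(len(I))
    for _ in range(steps):
        r += (dt / tau) * (-r + np.maximum(W @ r + I, 0.0))
    return r

# Two mutually inhibiting units receiving slightly different inputs.
W = np.array([[0.0, -0.6],
              [-0.6, 0.0]])
I = np.array([1.0, 0.9])
print(simulate_rates(W, I))   # the more strongly driven unit wins out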

Mechanisms of decision-making and perceptual learning
We investigate the perceptual learning of mechanisms for complex feature extraction as well as velocity extraction. In addition, we investigate neural decision-making processes in motion perception, which serve as a model to link neural data with perceptual behavior, e.g., in pattern discrimination tasks. Furthermore, we assess the influence of response modulation through learning by utilizing large-scale neural network modeling. The approaches considered here utilize biologically realistic learning mechanisms, namely Hebbian learning implemented by spike-timing dependent plasticity (STDP). The results of these computational investigations are in turn expected to inform the development of clinical methods for improving perceptual capabilities in rehabilitation.
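
For readers unfamiliar with STDP, the core of the rule is a weight change that depends on the relative timing of pre- and postsynaptic spikes. The sketch below uses a standard pair-based formulation with textbook-style constants chosen here for illustration, not the parameters of our models; it shows the characteristic asymmetric learning window.

import numpy as np

def stdp_weight_change(dt_ms, a_plus=0.01, a_minus=0.012,
                       tau_plus=20.0, tau_minus=20.0):
    # Pair-based STDP: dt = t_post - t_pre in milliseconds.
    # Pre-before-post (dt > 0) potentiates, post-before-pre (dt < 0)
    # depresses, both with exponentially decaying magnitude.
    dt_ms = np.asarray(dt_ms, dtype=float)
    return np.where(dt_ms > 0,
                    a_plus * np.exp(-dt_ms / tau_plus),
                    -a_minus * np.exp(dt_ms / tau_minus))

print(stdp_weight_change([-40.0, -10.0, 10.0, 40.0]))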

Modeling attention, search, eye movements, and social communication
We investigate the role and neural mechanisms of spatial attention in the visual perception of static and moving patterns. The visual detection of human heads and the associated estimation of head pose and view direction provide core mechanisms for assessing visual communication processes between members of a group. Within the framework of a neural architecture of feedforward and feedback processing, we model the generation of response patterns that are consistent with the biased competition hypothesis of attention. We also employ eye tracking studies to monitor the active motor processes involved in overt gaze shifts that deploy focused attention to different locations in a scene. In addition, we pursue an approach to link form and motion information at an intermediate level and to assign focused attention to spatio-temporal patterns that signal vision-based social communication. In the long term, the analysis of body pose, head orientation and eye gaze helps to develop advanced mechanisms for human-computer interaction.
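
A compact way to express the biased competition idea is a normalization circuit in which attention scales a stimulus drive before responses are divisively normalized by the pooled activity. The sketch below is a toy version of that account, written as our own simplification for this page rather than the lab's architecture; it shows how attending to one of two stimuli in a receptive field suppresses the relative influence of the other.

import numpy as np

def biased_competition(drives, gains):
    # Attentional gains scale each stimulus drive; divisive normalization
    # by the pooled activity then yields competition between stimuli.
    excitation = np.asarray(gains, float) * np.asarray(drives, float)
    return excitation / (1.0 + excitation.sum())

# Two equal stimuli in one receptive field.
print(biased_competition([0.8, 0.8], [1.0, 1.0]))  # neutral condition
print(biased_competition([0.8, 0.8], [2.0, 1.0]))  # attention on stimulus 0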

Space-variant vision
An important characteristic of the primate visual system is the space-variant mapping of the visual field onto the cortex, with a small central region of highest resolution (the fovea) and decreasing resolution towards the periphery. This, in turn, necessitates eye movements that allow rapid deployment of the high-resolution fovea to interesting regions of the environment. We investigate space-variant active vision for fixation and smooth pursuit, as well as the advantages of reduced sampling while keeping a wide field of view.
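
A common computational abstraction of this retino-cortical mapping is a log-polar sampling grid, in which ring radii grow exponentially with eccentricity. The sketch below is an illustrative abstraction with arbitrary grid sizes, not the lab's specific sensor model; it shows how such a grid covers a wide field of view with comparatively few samples.

import numpy as np

def log_polar_grid(n_rings=32, n_wedges=64, r_min=1.0, r_max=200.0):
    # Sampling positions of a simple log-polar ("retina-like") grid:
    # exponentially spaced ring radii give highest resolution at the
    # center (fovea) and coarser sampling towards the periphery.
    radii = r_min * (r_max / r_min) ** (np.arange(n_rings) / (n_rings - 1))
    angles = 2.0 * np.pi * np.arange(n_wedges) / n_wedges
    x = radii[:, None] * np.cos(angles)[None, :]
    y = radii[:, None] * np.sin(angles)[None, :]
    return x, y   # each of shape (n_rings, n_wedges)

x, y = log_polar_grid()
print(x.size, "samples cover a field of view of radius 200")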

Applications - Computer vision, image processing, and pattern recognition
We transfer biologically inspired mechanisms and models into algorithms for computer vision, image processing, and pattern recognition and classification in various domains. The main application areas are human-computer interaction, medical systems, and automotive systems.


Current research projects and collaborations

The computer as dialogue companion: Perception and interaction in multi-user scenarios
(Funding: State of Baden-Wuerttemberg)
The project aims at developing novel mechanisms and principles of system architectures for intelligent human-computer interaction and at evaluating these using psychophysical and psychobiological test methods. We aim at the creation of advanced user interfaces with extended perceptual and interactive capabilities that utilize adaptive mechanisms and their ability to learn. Such systems should be capable of analyzing the spatio-temporal and user-specific context of interaction, possibly with multiple partners.
(Link: http://www.uni-ulm.de/in/pit.html)

Sensor-based situation estimation and adaptive human-automotive interaction in multi-vehicle scenarios
(Funding: State of Baden-Wuerttemberg)
The project aims at developing methods to evaluate the status, complexity and potential danger of the current traffic situation on the basis of context representations. Intentions and plans of other traffic participants will be considered by integrating information gained from inter-vehicle communication regarding the status of other vehicles. Information and behavioral strategies for the driver-vehicle interaction will be derived and adapted accordingly. The adaptive speech-based driver-vehicle interface will be triggered by the complexity of the current traffic situation as well as the status and attentiveness of the driver.
(Link: http://www.uni-ulm.de/in/pit.html)

Neural Decision-Making in Motion (Decisions-in-Motion)
(Funding: European Union, EU FP6 IST Cognitive Systems integrated project, no. 027198)
The research goal of the project "DECISIONS-IN-MOTION" is to describe the neural mechanisms used to guide behavior in complex visual scenes, in which the living (or animated) agent is in motion and navigates to avoid stationary and/or moving objects. During reporting period P1 we have explored motion-based image segmentation in the visual cortex, and we have begun to derive neural models that explicitly make use of a hierarchy of sensory areas (low-, mid- and high-level visual areas) to extract meaningful information about the location and motion of objects in the environment. One objective of the project is to use the outputs of these units for sensory-based decision-making; this process weights the inputs and the relations between them based on utility functions. The resulting cognitive architecture will be tested on an autonomous robot navigating in complex visual environments to determine the efficiency of the image motion segmentation and the goal-directed adaptive behavior.
(Link: http://www.decisionsinmotion.org/)

Brain Plasticity and Perceptual Learning
(Funding: German Federal Ministry of Education and Research (BMBF), project no. 01GW0763)
This interdisciplinary project consortium will investigate cortical plasticity in humans to gain a better understanding of the neuronal changes and synaptic processes that take place during perceptual learning. Compensatory alterations in brain function evoked by training in brain-damaged patients will also be investigated. The sub-project Learning in the dorsal stream: Mechanisms and deficits of complex motion processing (headed by Neumann) will develop network models of the dorsal pathway for areas MT and MST. The network properties during learning will be fitted to fMRI data. Several questions will be explored during the course of the project: (a) how learning influences local as well as global mechanisms in the representation of velocities at the different stages of motion perception, (b) the role of velocity gradients and their explicit representation during learning of motion discrimination, and (c) how learning can improve the resolution of motion ambiguities in pattern motion configurations. The weight adaptation in the model will be based on biologically plausible mechanisms of Hebbian plasticity.
(Link: http://www.brain-plasticity.org/)

Smart Eyes: Attending and Recognizing Instances of Salient Events (SEARISE)
(Funding: European Union, EU FP7 ICT project, no. 215866-SEARISE)
The SEARISE project develops a trinocular active cognitive vision system, the Smart Eyes, for the detection, tracking and categorization of salient events and behaviors. Unlike other approaches in video surveillance, the system will have a human-like capability to learn continuously from the visual input, self-adjust to an ever-changing visual environment, fixate salient events and follow their motion, and categorize salient events depending on the context. Inspired by the human visual system, a cyclopean camera will perform wide-range monitoring of the visual field while active binocular stereo cameras fixate and track salient objects, mimicking a focus of attention that switches between different interesting locations.
(Link: http://www.searise.eu/)


Members - Publications - Teaching

Vision group members (Visionaries)

Publications (per year) (last update: Aug. 2008)

Teaching - Course program


Former vision group members