A two-stage scheme for visual object recognition based on selective attention
Günther Palm, Ulrich Kaufmann, Rebecca Fay
We present a two-stage scheme for visual object recognition. First, a
window of attention is determined in an image by means of
low-resolution colour and shape information. Then high-resolution
visual features (such as edges, corners or T-junctions) are extracted
from this window and passed to a trained neural network (a
hierarchically organized RBF network) for object recognition.
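The first stage can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that the low-resolution cue is simply the colour distance to a known target colour (shape information is omitted), and the function name attention_window and the parameters target_rgb, factor and window are hypothetical.

```python
import numpy as np

def attention_window(image, target_rgb, factor=8, window=64):
    """Return (top, left, bottom, right) of a window centred on the
    coarse image cell whose mean colour is closest to target_rgb.

    Assumption: colour distance stands in for the low-resolution
    colour and shape cues mentioned in the text.
    """
    h, w, _ = image.shape
    hc, wc = h // factor, w // factor
    # Low-resolution copy: average over factor x factor pixel blocks.
    coarse = image[:hc * factor, :wc * factor].astype(float)
    coarse = coarse.reshape(hc, factor, wc, factor, 3).mean(axis=(1, 3))
    # Distance of every coarse cell to the target colour.
    dist = np.linalg.norm(coarse - np.asarray(target_rgb, dtype=float), axis=2)
    r, c = np.unravel_index(np.argmin(dist), dist.shape)
    # Map the centre of the best coarse cell back to full resolution.
    cy = r * factor + factor // 2
    cx = c * factor + factor // 2
    # Clamp the window so that it stays inside the image.
    top = int(np.clip(cy - window // 2, 0, max(h - window, 0)))
    left = int(np.clip(cx - window // 2, 0, max(w - window, 0)))
    return top, left, min(top + window, h), min(left + window, w)
```

Only the pixels inside the returned window would then be passed on to the high-resolution feature extraction and the classifier, which is where the saving in computation comes from.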
This two-stage process reflects some properties of human and monkey
vision (eye movements guided by visual attention, high-resolution
processing in the fovea, decreasing resolution towards the periphery).
It also helps to save computational power and to perform sophisticated
object recognition in real time.
We are currently applying this scheme to soccer-playing robots (our
RoboCup team) and in the MirrorBot project, where a robot has to
grasp different kinds of fruit. In these scenarios we can select
important features for top-down guidance of the first (attention)
stage. We will present the selected windows and the recognition
performance for various images. This will demonstrate the
importance of top-down selection of saliency features in practical
applications.
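One common way to express top-down selection of saliency features is a task-dependent weighting of feature maps. The sketch below assumes this formulation, which the abstract itself does not specify; the function names combine_saliency and most_salient and the feature names used as dictionary keys are hypothetical.

```python
import numpy as np

def combine_saliency(feature_maps, weights):
    """Weighted sum of feature maps, each rescaled to [0, 1].

    feature_maps: dict mapping a feature name to a 2-D array.
    weights: dict mapping a feature name to its task-dependent
    (top-down) weight; features without a weight are ignored.
    """
    first = next(iter(feature_maps.values()))
    total = np.zeros_like(first, dtype=float)
    for name, fmap in feature_maps.items():
        fmap = np.asarray(fmap, dtype=float)
        span = fmap.max() - fmap.min()
        # A constant map carries no information and contributes nothing.
        norm = (fmap - fmap.min()) / span if span > 0 else np.zeros_like(fmap)
        total += weights.get(name, 0.0) * norm
    return total

def most_salient(saliency):
    """Return the (row, column) index of the saliency maximum."""
    idx = np.unravel_index(np.argmax(saliency), saliency.shape)
    return tuple(int(i) for i in idx)
```

With such a scheme, a ball-finding task might weight a colour map highly, while a fruit-grasping task might emphasise a different combination of colour and shape maps; the same image then yields different windows of attention.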
We intend to compare our findings with measurements of human eye movements
in demanding sensory-motor virtual-reality tasks.