Active Vision for Humanoid Robots

Abstract

Human perception is an active process. By altering its viewpoint rather than passively observing its surroundings, and by operating on sequences of images rather than on single frames, the human visual system can seek out the most relevant information based on prior knowledge; this is how humans develop cognitive perception as they grow up. Likewise, active vision is indispensable for humanoid robots to develop cognitive perception. Humanoid robot research already has a history of nearly half a century. Approximately 2000 research papers on active vision were published between 1986 and 2010, covering a wide range of research fields in robotics. The current trend is to realize active vision with a stereo setup or a Kinect combined with neck movements. Human perception, however, combines eye and neck movements. To design an advanced humanoid active vision system, biologically inspired eye movements similar to those of human eyes should therefore be taken into consideration. Depth perception can then be obtained from pure image information without any advanced sensors. This thesis presents a complete active vision system with 4 degrees of freedom that works in a similar way to human vision. It is composed of the following parts:

1. The mechanical design has 4 motors: independent vergence angle control for the eyes, one tilt motor shared by both eyes and one pan motor for the neck.
2. The controllers reproduce human eye movements: saccadic, pursuit, vestibulo-ocular reflex (VOR) and vergence eye movements. Motor positions and velocities are controlled with input from an Inertial Measurement Unit (IMU).
3. An optimal feature selection mechanism, based on various properties of objects, is applied before tracking.
4. To smoothly pursue an object and learn it from different perspectives, three different trackers are used: a color-based tracker, an AR-marker-based tracker for testing, and a robust online tracker.
5. A saliency detector segments the most dominant objects from the scene, and a robust online tracker provides refined segmentations. As a result, the robot is able to explore unknown environments on its own.
6. Because the vergent eyes move at different angles, both intrinsic and extrinsic calibration are required to ensure the accuracy of 3D perception. Here the motor positions are used together with a robust M-estimator to recover the geometry between the two eyes.
7. Humans use multiple cues for depth perception, and depth perception is strongly related to eye movements. Multi-mode depth perception is applied to perceive the environment and objects in 3D for further vision tasks such as object recognition and object grasping.

The realized system works within real-time constraints and with low-cost cameras and motors, and therefore provides an affordable solution for industrial applications. In conclusion, active vision can be applied to a variety of applications and is a rapidly growing research domain. This thesis and its proposed vision system provide insight into the research field of active humanoid robot vision.
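The relationship between vergence eye movements and depth perception mentioned above can be illustrated with plain triangulation. The following is a minimal sketch under simplifying assumptions, not the method used in the thesis: both eyes lie on a common horizontal baseline, each vergence angle is measured inward from the straight-ahead direction, and both optical axes intersect exactly at the fixated point. The function name `fixation_point` and its parameters are hypothetical.

```python
import math


def fixation_point(baseline_m, theta_left, theta_right):
    """Triangulate the fixated point from two vergence angles.

    Assumed geometry (illustrative only): the left eye sits at
    x = -baseline/2, the right eye at x = +baseline/2, both looking
    along +z when the angles are zero. Angles are in radians and are
    positive when the eye rotates inward (toward the other eye).

    Returns (x, z): lateral offset and depth of the fixation point
    in metres, in the frame centred between the eyes.
    """
    convergence = math.tan(theta_left) + math.tan(theta_right)
    if convergence <= 0.0:
        # Parallel or diverging gaze rays never intersect in front of the eyes.
        raise ValueError("gaze rays do not converge")
    # Left ray:  x = -b/2 + z * tan(theta_left)
    # Right ray: x = +b/2 - z * tan(theta_right)
    # Equating the two gives z = b / (tan(theta_left) + tan(theta_right)).
    z = baseline_m / convergence
    x = -baseline_m / 2.0 + z * math.tan(theta_left)
    return x, z


if __name__ == "__main__":
    # Hypothetical 6.5 cm baseline, both eyes fixating a point 1 m straight ahead.
    theta = math.atan2(0.0325, 1.0)
    x, z = fixation_point(0.065, theta, theta)
    print(f"x = {x:.3f} m, z = {z:.3f} m")
```

In a real system the motor encoder readings would first have to be mapped to these angles through calibration, which is why the geometry between the two eyes needs to be recovered before depth can be trusted.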
