Robotics and Biology Laboratory

Comparison of Filter Architectures for Sensor Fusion of Vision and Touch

People

Nina Grabka

Pia Bideau

Oliver Brock

Abstract

For robots to interact with objects in unknown environments, they must acquire
sensor data and continuously estimate the object's position. In this work, the
sensor fusion of RGB-D camera data, proprioception, and force sensor data is
investigated. The robustness, accuracy, and applicability of various fusion
approaches are studied, with a particular focus on cross-correlations between
sensor data. For this purpose, an object is moved on a flat surface with a
robotic arm and the recorded sensor data is collected as a dataset.
The goal is to create a dataset that is based exclusively on real sensor data
and covers various possible failure cases both from the excitation (pushing,
kicking) and from sensor disturbances (vision, force). Prior work exists that
evaluates sensor fusion on partially simulated and artificially perturbed data,
as well as work comparing different fusion algorithms. Here, however, only real-
world data is used, and specifically Kalman-filter-based fusion architectures are
compared. A single multimodal EKF is compared with two architectures that
fuse separate vision and touch EKF estimates. Of the latter, a distinction is
made between a feedback architecture that feeds the fused estimate back into
the individual filters and a feedforward architecture in which the unimodal filters
are independent.
are independent. The feedback architecture results in high cross-correlations
between the estimates of the unimodal filters, so only Covariance Intersection
should be used to fuse them. In the feedback architecture, naive fusion is
also reasonable. There are indications that feedforward fusion may be more
robust in the presence of strong perturbations, especially when sensor data are
not available for an extended period of time. Moreover, the accuracy of the
estimation is comparable to that from feedback fusion and the single multimodal
EKF with the added advantage that the unimodal estimates are separately
available.
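The difference between the two fusion rules mentioned above can be sketched as follows (a minimal illustration with hypothetical helper functions, not the code used in this work). Naive fusion adds the information matrices of the two estimates, which is only valid when they are uncorrelated, as in the feedforward architecture. Covariance Intersection instead takes a convex combination in information space, which stays consistent under unknown cross-correlations, as produced by the feedback architecture, at the cost of a more conservative fused covariance.

```python
import numpy as np


def naive_fusion(a, A, b, B):
    """Information-form fusion of estimates (a, A) and (b, B),
    assuming the two estimates are uncorrelated."""
    Ai, Bi = np.linalg.inv(A), np.linalg.inv(B)
    C = np.linalg.inv(Ai + Bi)              # fused covariance
    c = C @ (Ai @ a + Bi @ b)               # fused mean
    return c, C


def covariance_intersection(a, A, b, B, n_grid=101):
    """Covariance Intersection: convex combination of the information
    matrices, consistent for any (unknown) cross-correlation. The weight
    omega is chosen by grid search to minimize the trace of the fused
    covariance."""
    Ai, Bi = np.linalg.inv(A), np.linalg.inv(B)
    best = None
    for w in np.linspace(0.0, 1.0, n_grid):
        C = np.linalg.inv(w * Ai + (1.0 - w) * Bi)
        if best is None or np.trace(C) < best[0]:
            best = (np.trace(C), w, C)
    _, w, C = best
    c = C @ (w * Ai @ a + (1.0 - w) * Bi @ b)
    return c, C
```

Because the convex combination never adds more information than either input, the Covariance Intersection result is always at least as conservative (larger covariance) as the naive fusion of the same two estimates.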