Some robots could now be mistaken for humans. A few of them have a head, two eyes, a mouth and a nose, but their abilities are still inferior to those of humans in many respects. Especially when it comes to perception and independent navigation, robots still frequently run up against their limits. The essential questions every robot needs to answer are: Where am I? And what’s around me? “When you think about the use of robots under difficult conditions – on the assembly line or in security-related environments – it’s precisely there that they need a deep understanding of their surroundings,” says Professor Dr.-Ing. Guillermo Gallego, professor of robotic interactive perception at TU Berlin and principal investigator at the Science of Intelligence (SCIoI) Cluster of Excellence.
This classic problem of robotics is known as “simultaneous localization and mapping” (SLAM): a mobile autonomous system must build a detailed three-dimensional map of its surroundings while simultaneously determining its own exact location within that map, often without any prior knowledge of the environment.
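The circular dependency at the heart of SLAM can be shown with a deliberately tiny sketch (a hypothetical 1D toy, not any algorithm discussed here): a pose can only be corrected against landmarks that are already in the map, and a new landmark can only be placed using the current, imperfect pose estimate.

```python
# Toy 1D SLAM loop (hypothetical illustration): the robot alternates
# between localizing against known landmarks and mapping new ones
# from its current pose estimate.

def slam_step(pose, odometry, landmark_map, observed_range, landmark_id):
    """One cycle: predict the pose, then either localize or map."""
    pose += odometry  # dead reckoning: accumulates drift on its own
    if landmark_id in landmark_map:
        # Localize: blend the drifting estimate with the pose implied
        # by the stored landmark position and the measured range.
        pose = 0.5 * pose + 0.5 * (landmark_map[landmark_id] - observed_range)
    else:
        # Map: place the new landmark relative to the uncertain pose.
        landmark_map[landmark_id] = pose + observed_range
    return pose

pose, landmarks = 0.0, {}
pose = slam_step(pose, 1.0, landmarks, 5.0, "door")  # maps "door" at 6.0
pose = slam_step(pose, 1.5, landmarks, 4.0, "door")  # drifts to 2.5, pulled back to 2.25
```

Errors in the pose corrupt the map and map errors corrupt the pose, which is why both estimates have to be maintained simultaneously rather than one after the other.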
Until now, this essential information has mostly been acquired with conventional cameras. Together with colleagues from the University of Hong Kong, Guillermo Gallego has developed a system that could overcome the limitations of conventional cameras and thereby open up completely new fields of application for autonomous robots. The goal of ESVO (event-based stereo visual odometry) is to enable a robot equipped with two special cameras to create a map of its surroundings and locate itself on this map – even while it is in motion.
“We use highly specialized, biologically inspired cameras, so-called neuromorphic cameras,” says Gallego. Their sensors are based on the human visual system, which is responsible, in particular, for rapid motion detection. These cameras, also known as event cameras, do not record their surroundings as conventional images. Instead, each pixel reacts extremely quickly, and independently of all other pixels, to every change in light intensity, and transmits this information in the form of action potentials, also known as spikes. “Conventional cameras record approximately 100 images per second. With event cameras, each individual pixel reacts to a change in intensity within a microsecond and otherwise sends no signal at all,” explains Gallego.
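In a simplified model, one event pixel can be sketched as follows (the contrast threshold and the pure-Python event loop are illustrative assumptions, not the specification of any real sensor): the pixel remembers the log-intensity at its last event and fires a signed spike each time the change since then crosses a threshold.

```python
import math

# Simplified model of a single event-camera pixel (illustrative only):
# the pixel compares the current log-intensity against a reference level
# stored at its last event and emits a signed spike whenever the
# difference crosses a contrast threshold.

CONTRAST_THRESHOLD = 0.2  # assumed contrast sensitivity

def pixel_events(intensities, timestamps_us):
    """Return (timestamp, polarity) events for one pixel's intensity trace."""
    events = []
    ref = math.log(intensities[0])  # reference log-intensity at last event
    for t, i in zip(timestamps_us[1:], intensities[1:]):
        delta = math.log(i) - ref
        while abs(delta) >= CONTRAST_THRESHOLD:
            polarity = 1 if delta > 0 else -1     # brighter (+1) or darker (-1)
            events.append((t, polarity))          # microsecond timestamp
            ref += polarity * CONTRAST_THRESHOLD  # move reference, check again
            delta = math.log(i) - ref
    return events

# A static scene produces no output at all; a sudden brightening
# produces a short burst of positive events:
pixel_events([1.0, 1.0, 1.0], [0, 1, 2])  # -> []
pixel_events([1.0, 2.0], [0, 10])         # -> [(10, 1), (10, 1), (10, 1)]
```

The point of the sketch is the asymmetry with frame cameras: the amount of data scales with how much the scene changes, not with time, which is where the microsecond latency comes from.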
With ESVO, the researchers have for the first time developed an algorithm that captures these spikes from two cameras, combines them with their precise temporal information and, working in parallel, builds a map of the surroundings and calculates the position of the robot’s “eyes” within this map. “The decisive advantage of ESVO is that the algorithm exploits the low latency, high temporal resolution and high dynamic range of these special cameras to solve the problem of stereo SLAM. For the first time, we can also enable spatial orientation in difficult lighting conditions,” says the scientist.
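One ingredient of the stereo step can be sketched in a toy form (a simplified, hypothetical illustration; the function name, patch size and matching cost are assumptions, not ESVO’s actual formulation): each camera keeps a “time surface” storing the timestamp of the most recent event at every pixel, and depth follows from the horizontal disparity at which the left and right surfaces agree best.

```python
# Toy stereo matching on "time surfaces" (hypothetical sketch): find the
# horizontal shift that best aligns a patch of recent-event timestamps
# from the left camera with the right camera's timestamps.

def best_disparity(ts_left, ts_right, row, col, max_disp=8, patch=3):
    """Disparity (in pixels) minimizing the summed timestamp difference
    between a left patch and right patches shifted along the same row."""
    h = patch // 2

    def cost(d):
        # Sum of |t_left - t_right| over a patch x patch window,
        # with the right window shifted d pixels to the left.
        return sum(
            abs(ts_left[row + r][col + c] - ts_right[row + r][col + c - d])
            for r in range(-h, h + 1)
            for c in range(-h, h + 1)
        )

    return min(range(max_disp), key=cost)

# Two 9x16 time surfaces: the same blob of recent events appears
# 3 pixels further left in the right camera, i.e. at disparity 3.
ts_left = [[0.0] * 16 for _ in range(9)]
ts_right = [[0.0] * 16 for _ in range(9)]
for k, (r, c) in enumerate((r, c) for r in range(3, 6) for c in range(9, 12)):
    ts_left[r][c] = float(k + 1)
    ts_right[r][c - 3] = float(k + 1)

best_disparity(ts_left, ts_right, row=4, col=10)  # -> 3
```

Because each timestamp is only microseconds old, such a comparison stays meaningful even during fast motion; a real pipeline would additionally filter and fuse these depth estimates over time, which is where the actual algorithm does the heavy lifting.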
Gallego’s research thus fits perfectly into the SCIoI concept, which is based on combining synthetic and analytical intelligence research. “ESVO could, for example, be used in self-driving cars that have to move quickly in dynamic lighting conditions or even in very diffuse light,” he concludes.