Christian Forster, Luca Carlone, Frank Dellaert, Davide Scaramuzza, IMU preintegration on manifold for efficient visual-inertial maximum-a-posteriori estimation, In: Robotics: Science and Systems (RSS), Unknown, Rome, Italy, 2015-07-13. (Conference or Workshop Paper published in Proceedings)
Recent results in monocular visual-inertial navigation (VIN) have shown that optimization-based approaches outperform filtering methods in terms of accuracy due to their capability to relinearize past states. However, the improvement comes at the cost of increased computational complexity. In this paper, we address this issue by preintegrating inertial measurements between selected keyframes. The preintegration allows us to accurately summarize hundreds of inertial measurements into a single relative motion constraint. Our first contribution is a preintegration theory that properly addresses the manifold structure of the rotation group and carefully deals with uncertainty propagation. The measurements are integrated in a local frame, which eliminates the need to repeat the integration when the linearization point changes while leaving the opportunity for belated bias corrections. The second contribution is to show that the preintegrated IMU model can be seamlessly integrated in a visual-inertial pipeline under the unifying framework of factor graphs. This enables the use of a structureless model for visual measurements, further accelerating the computation. The third contribution is an extensive evaluation of our monocular VIN pipeline: experimental results confirm that our system is very fast and demonstrates superior accuracy with respect to competitive state-of-the-art filtering and optimization algorithms, including off-the-shelf systems such as Google Tango.

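As an illustration of the preintegration idea described above, the sketch below folds a batch of gyroscope and accelerometer samples into a single relative rotation, velocity, and position increment expressed in the first keyframe's frame. This is a simplified Euler scheme that ignores gravity, biases, and the covariance propagation central to the paper; all function names are ours.

```python
import numpy as np

def skew(w):
    """Skew-symmetric matrix, so that skew(w) @ v == np.cross(w, v)."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def so3_exp(w):
    """Rotation matrix from a rotation vector (Rodrigues' formula)."""
    theta = np.linalg.norm(w)
    if theta < 1e-8:
        return np.eye(3) + skew(w)
    K = skew(w / theta)
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * K @ K

def preintegrate(gyro, accel, dt):
    """Fold N gyro/accel samples into one (dR, dv, dp) relative-motion
    increment, expressed in the frame of the first keyframe
    (gravity and biases omitted for clarity)."""
    dR, dv, dp = np.eye(3), np.zeros(3), np.zeros(3)
    for w, a in zip(gyro, accel):
        dp = dp + dv * dt + 0.5 * dR @ a * dt**2
        dv = dv + dR @ a * dt
        dR = dR @ so3_exp(w * dt)
    return dR, dv, dp
```

Because the increments are relative to the first keyframe, a later change of that keyframe's linearization point does not force re-integration of the raw samples, which is the motivation for preintegration.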
Elias Müggler, Guillermo Gallego, Davide Scaramuzza, Continuous-time trajectory estimation for event-based vision sensors, In: Robotics: Science and Systems (RSS), s.n., Rome, Italy, 2015-07-13. (Conference or Workshop Paper published in Proceedings)
Event-based vision sensors, such as the Dynamic Vision Sensor (DVS), do not output a sequence of video frames like standard cameras, but a stream of asynchronous events. An event is triggered when a pixel detects a change of brightness in the scene. An event contains the location, sign, and precise timestamp of the change. The high dynamic range and temporal resolution of the DVS, which is on the order of microseconds, make this a very promising sensor for high-speed applications, such as robotics and wearable computing. However, due to the fundamentally different structure of the sensor’s output, new algorithms that exploit the high temporal resolution and the asynchronous nature of the sensor are required. In this paper, we address ego-motion estimation for an event-based vision sensor using a continuous-time framework to directly integrate the information conveyed by the sensor. The DVS pose trajectory is approximated by a smooth curve in the space of rigid-body motions using cubic splines and it is optimized according to the observed events. We evaluate our method using datasets acquired from sensor-in-the-loop simulations and onboard a quadrotor performing flips. The results are compared to the ground truth, showing the good performance of the proposed technique.

Matthias Fässler, Flavio Fontana, Christian Forster, Davide Scaramuzza, Automatic re-initialization and failure recovery for aggressive flight with a monocular vision-based quadrotor, In: IEEE International Conference on Robotics and Automation (ICRA), Institute of Electrical and Electronics Engineers (IEEE), Seattle WA, 2015-05-26. (Conference or Workshop Paper published in Proceedings)
Autonomous, vision-based quadrotor flight is widely regarded as a challenging perception and control problem since the accuracy of a flight maneuver is strongly influenced by the quality of the on-board state estimate. In addition, any vision-based state estimator can fail due to the lack of visual information in the scene or due to the loss of feature tracking after an aggressive maneuver. When this happens, the robot should automatically re-initialize the state estimate to maintain its autonomy and, thus, guarantee safety for itself and the environment. In this paper, we present a system that enables a monocular-vision-based quadrotor to automatically recover from any unknown, initial attitude with significant velocity, such as after loss of visual tracking due to an aggressive maneuver. The recovery procedure consists of multiple stages, in which the quadrotor, first, stabilizes its attitude and altitude, then, re-initializes its visual state-estimation pipeline before stabilizing fully autonomously. To experimentally demonstrate the performance of our system, we aggressively throw the quadrotor in the air by hand and have it recover and stabilize all by itself. We chose this example as it simulates conditions similar to failure recovery during aggressive flight. Our system was able to recover successfully in several hundred throws in both indoor and outdoor environments.
|
Elias Müggler, Christian Forster, Nathan Baumli, Guillermo Gallego, Davide Scaramuzza, Lifetime estimation of events from dynamic vision sensors, In: IEEE International Conference on Robotics and Automation (ICRA), Institute of Electrical and Electronics Engineers (IEEE), Seattle WA, 2015-05-26. (Conference or Workshop Paper published in Proceedings)
We propose an algorithm to estimate the “lifetime” of events from retinal cameras, such as a Dynamic Vision Sensor (DVS). Unlike standard CMOS cameras, a DVS only transmits pixel-level brightness changes (“events”) at the time they occur with microsecond resolution. Due to its low latency and sparse output, this sensor is very promising for high-speed mobile robotic applications. We develop an algorithm that augments each event with its lifetime, which is computed from the event’s velocity on the image plane. The generated stream of augmented events gives a continuous representation of events in time, hence enabling the design of new algorithms that outperform those based on the accumulation of events over fixed, artificially-chosen time intervals. A direct application of this augmented stream is the construction of sharp gradient (edge-like) images at any time instant. We successfully demonstrate our method in different scenarios, including high-speed quadrotor flips, and compare it to standard visualization methods.
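Per event, the lifetime idea reduces to the time the generating edge needs to cross one pixel, given its image-plane velocity. A minimal sketch of that relation (our own simplification; the paper also estimates the velocity itself from the event stream):

```python
import numpy as np

def event_lifetime(vx, vy, min_speed=1e-6):
    """Lifetime (seconds) of an event whose generating edge moves with
    image-plane velocity (vx, vy) in pixels per second: the time the
    edge needs to travel one pixel. Near-static edges live forever."""
    speed = np.hypot(vx, vy)  # pixels per second
    if speed < min_speed:
        return float("inf")
    return 1.0 / speed
```

An edge sweeping across the sensor at 500 px/s keeps each of its events alive for 2 ms, so rendering only the currently-alive events yields a sharp, roughly one-pixel-wide edge instead of a motion-blurred band.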
|
Christian Forster, Matthias Faessler, Flavio Fontana, Manuel Werlberger, Davide Scaramuzza, Continuous on-board monocular-vision-based elevation mapping applied to autonomous landing of micro aerial vehicles, In: IEEE International Conference on Robotics and Automation (ICRA), Institute of Electrical and Electronics Engineers (IEEE), Seattle, US, 2015-05-26. (Conference or Workshop Paper published in Proceedings)
In this paper, we propose a resource-efficient system for real-time 3D terrain reconstruction and landing-spot detection for micro aerial vehicles. The system runs on an on-board smartphone processor and requires only the input of a single down-looking camera and an inertial measurement unit. We generate a two-dimensional elevation map that is probabilistic, of fixed size, and robot-centric, thus always covering the area immediately underneath the robot. The elevation map is continuously updated at a rate of 1 Hz with depth maps that are triangulated from multiple views using recursive Bayesian estimation. To highlight the usefulness of the proposed mapping framework for autonomous navigation of micro aerial vehicles, we successfully demonstrate fully autonomous landing including landing-spot detection in real-world experiments.
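The recursive Bayesian update of the elevation map can be pictured, per cell, as a one-dimensional Kalman-style fusion of triangulated height measurements. This is a deliberate simplification of the paper's model; the class and variable names are ours:

```python
import numpy as np

class ElevationCell:
    """One cell of a probabilistic elevation map: height modeled as a
    Gaussian and fused recursively from triangulated depth measurements
    (simplified per-cell Kalman update)."""

    def __init__(self):
        self.h, self.var = 0.0, np.inf  # uninformed prior

    def update(self, z, var_z):
        """Fuse one height measurement z with variance var_z."""
        if np.isinf(self.var):          # first measurement initializes
            self.h, self.var = z, var_z
            return
        k = self.var / (self.var + var_z)   # Kalman gain
        self.h += k * (z - self.h)
        self.var *= (1.0 - k)
```

Repeated observations of a cell shrink its variance; a robot-centric, fixed-size map then only has to shift cells as the vehicle moves, discarding those that leave the window.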
|
Volker Grabe, Heinrich H Bülthoff, Davide Scaramuzza, Paolo Robuffo Giordano, Nonlinear Ego-Motion Estimation from Optical Flow for Online Control of a Quadrotor UAV, International Journal of Robotics Research, Vol. 34 (8), 2015. (Journal Article)
|
|
András L Majdik, Damiano Verda, Yves Albers-Schoenberg, Davide Scaramuzza, Air-ground matching: appearance-based GPS-denied urban localization of micro aerial vehicles, Journal of Field Robotics, Vol. 32 (7), 2015. (Journal Article)
In this paper, we address the problem of globally localizing and tracking the pose of a camera-equipped micro aerial vehicle (MAV) flying in urban streets at low altitudes without GPS. An image-based global positioning system is introduced to localize the MAV with respect to the surrounding buildings. We propose a novel air-ground image-matching algorithm to search the airborne image of the MAV within a ground-level, geotagged image database. Based on the detected matching image features, we infer the global position of the MAV by back-projecting the corresponding image points onto a cadastral three-dimensional city model. Furthermore, we describe an algorithm to track the position of the flying vehicle over several frames and to correct the accumulated drift of the visual odometry whenever a good match is detected between the airborne and the ground-level images. The proposed approach is tested on a 2 km trajectory with a small quadrocopter flying in the streets of Zurich. Our vision-based global localization can robustly handle extreme changes in viewpoint, illumination, perceptual aliasing, and over-season variations, thus outperforming conventional visual place-recognition approaches. The dataset is made publicly available to the research community. To the best of our knowledge, this is the first work that studies and demonstrates global localization and position tracking of a drone in urban streets with a single onboard camera.
|
Chiara Troiani, Agostino Martinelli, Christian Laugier, Davide Scaramuzza, Low computational-complexity algorithms for vision-aided inertial navigation of micro aerial vehicles, Robotics and Autonomous Systems, Vol. 69, 2015. (Journal Article)
|
|
Elias Müggler, Matthias Fässler, Flavio Fontana, Davide Scaramuzza, Aerial-guided navigation of a ground robot among movable obstacles, In: IEEE Intl. Symp. on Safety, Security, and Rescue Robotics (SSRR), Toyako-cho Cultural Center, Toyako-cho, Hokkaido, Japan, 2014-10-27. (Conference or Workshop Paper published in Proceedings)
We demonstrate the fully autonomous collaboration of an aerial and a ground robot in a mock-up disaster scenario. Within this collaboration, we make use of the individual capabilities and strengths of both robots. The aerial robot first maps an area of interest, then it computes the fastest mission for the ground robot to reach a spotted victim and deliver a first-aid kit. Such a mission includes driving and removing obstacles in the way while being constantly monitored and commanded by the aerial robot. Our mission-planning algorithm distinguishes between movable and fixed obstacles and considers both the time for driving and removing obstacles. The entire mission is executed without any human interaction once the aerial robot is launched and requires a minimal amount of communication between the robots. We describe both the hardware and software of our system and detail our mission-planning algorithm. We present exhaustive results of both simulation and real experiments. Our system was successfully demonstrated more than 20 times at a trade fair.
|
Elias Müggler, Basil Huber, Davide Scaramuzza, Event-based, 6-DOF pose tracking for high-speed maneuvers, In: IEEE/RSJ Intl. Conf. on Intelligent Robots and Systems (IROS), Institute of Electrical and Electronics Engineers, Chicago, IL, USA, 2014-09-14. (Conference or Workshop Paper published in Proceedings)
In the last few years, we have witnessed impressive demonstrations of aggressive flights and acrobatics using quadrotors. However, those robots are actually blind. They do not see by themselves, but through the “eyes” of an external motion capture system. Flight maneuvers using onboard sensors are still slow compared to those attainable with motion capture systems. At the current state, the agility of a robot is limited by the latency of its perception pipeline. To obtain more agile robots, we need to use faster sensors. In this paper, we present the first onboard perception system for 6-DOF localization during high-speed maneuvers using a Dynamic Vision Sensor (DVS). Unlike a standard CMOS camera, a DVS does not wastefully send full image frames at a fixed frame rate. Conversely, similar to the human eye, it only transmits pixel-level brightness changes at the time they occur with microsecond resolution, thus offering the possibility to create a perception pipeline whose latency is negligible compared to the dynamics of the robot. We exploit these characteristics to estimate the pose of a quadrotor with respect to a known pattern during high-speed maneuvers, such as flips, with rotational speeds up to 1,200°/s. Additionally, we provide a versatile method to capture ground-truth data using a DVS.
|
Davide Scaramuzza, Michael C Achtelik, Lefteris Doitsidis, Friedrich Fraundorfer, Elias Kosmatopoulos, Agostino Martinelli, Markus W Achtelik, Margarita Chli, Savvas Chatzichristofis, Laurent Kneip, Daniel Gurdan, Lionel Heng, Gim Hee Lee, Simon Lynen, Lorenz Meier, Marc Pollefeys, Alessandro Renzaglia, Roland Siegwart, Jan Carsten Stumpf, Petri Tanskanen, Chiara Troiani, Stephan Weiss, Vision-controlled micro flying robots: from system design to autonomous navigation and mapping in GPS-denied environments, IEEE Robotics and Automation Magazine, Vol. 21 (3), 2014. (Journal Article)
Autonomous microhelicopters will soon play a major role in tasks like search and rescue, environment monitoring, security surveillance, and inspection. If they are further realized at small scale, they can also be used in narrow outdoor and indoor environments and represent only a limited risk for people. However, for such operations, navigating based only on global positioning system (GPS) information is not sufficient. Fully autonomous operation in cities or other dense environments requires microhelicopters to fly at low altitudes, where GPS signals are often shadowed, or indoors and to actively explore unknown environments while avoiding collisions and creating maps. This involves a number of challenges on all levels of helicopter design, perception, actuation, control, and navigation, which still have to be solved. The Swarm of Micro Flying Robots (SFLY) project was a European Union-funded project with the goal of creating a swarm of vision-controlled microaerial vehicles (MAVs) capable of autonomous navigation, three-dimensional (3-D) mapping, and optimal surveillance coverage in GPS-denied environments. The SFLY MAVs do not rely on remote control, radio beacons, or motion-capture systems but can fly all by themselves using only a single onboard camera and an inertial measurement unit (IMU). This article describes the technical challenges that have been faced and the results achieved from hardware design and embedded programming to vision-based navigation and mapping, with an overview of how all the modules work and how they have been integrated into the final system. Code, data sets, and videos are publicly available to the robotics community. Experimental results demonstrating three MAVs navigating autonomously in an unknown GPS-denied environment and performing 3-D mapping and optimal surveillance coverage are presented.
|
Christian Forster, Matia Pizzoli, Davide Scaramuzza, Appearance-based active, monocular, dense reconstruction for micro aerial vehicles, In: Robotics: Science and Systems, Unknown, Berkeley, California, USA, 2014-07-12. (Conference or Workshop Paper published in Proceedings)
In this paper, we investigate the following problem: given the image of a scene, what is the trajectory that a robot-mounted camera should follow to allow optimal dense depth estimation? The solution we propose is based on maximizing the information gain over a set of candidate trajectories. In order to estimate the information that we expect from a camera pose, we introduce a novel formulation of the measurement uncertainty that accounts for the scene appearance (i.e., texture in the reference view), the scene depth and the vehicle pose. We successfully demonstrate our approach in the case of real-time, monocular reconstruction from a micro aerial vehicle and validate the effectiveness of our solution in both synthetic and real experiments. To the best of our knowledge, this is the first work on active, monocular dense reconstruction, which chooses motion trajectories that minimize perceptual ambiguities inferred by the texture in the scene.
|
Chiara Troiani, Agostino Martinelli, Christian Laugier, Davide Scaramuzza, 2-point-based outlier rejection for camera-IMU systems with applications to micro aerial vehicles, In: IEEE International Conference on Robotics and Automation (ICRA), Institute of Electrical and Electronics Engineers, Hong Kong, 2014-05-31. (Conference or Workshop Paper published in Proceedings)
This paper presents a novel method to perform the outlier rejection task between two different views of a camera rigidly attached to an Inertial Measurement Unit (IMU). Only two feature correspondences and gyroscopic data from IMU measurements are used to compute the motion hypothesis. By exploiting this 2-point motion parametrization, we propose two algorithms to remove wrong data associations in the feature-matching process for the case of 6-DoF motion. We show that in the case of a monocular camera mounted on a quadrotor vehicle, motion priors from the IMU can be used to discard wrong estimations in the framework of a 2-point-RANSAC based approach. The proposed methods are evaluated on both synthetic and real data.
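With the relative rotation supplied by integrated gyroscope measurements, the epipolar constraint leaves only the translation direction unknown, and two bearing-vector correspondences pin it down — this is what makes a 2-point RANSAC hypothesis generator possible. A minimal sketch under our own naming and the convention x2 = R @ x1 + t (degeneracy and noise handling omitted):

```python
import numpy as np

def translation_from_two_points(R, p1a, p2a, p1b, p2b):
    """Recover the translation direction (up to sign and scale) from the
    known inter-frame rotation R and two bearing-vector correspondences
    (p1*, p2*), via the epipolar constraint p2^T [t]x R p1 = 0.
    Each correspondence forces t to be orthogonal to cross(p2, R @ p1)."""
    n_a = np.cross(p2a, R @ p1a)
    n_b = np.cross(p2b, R @ p1b)
    t = np.cross(n_a, n_b)          # orthogonal to both constraint normals
    norm = np.linalg.norm(t)
    return t / norm if norm > 1e-12 else t
```

In a RANSAC loop, each 2-point hypothesis is scored by how many remaining correspondences satisfy the resulting epipolar constraint, which is far cheaper than the 5-point case.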
|
Yanhua Jiang, Huiyan Chen, Guangming Xiong, Davide Scaramuzza, ICP stereo visual odometry for wheeled vehicles based on a 1DOF motion prior, In: IEEE International Conference on Robotics and Automation (ICRA), Institute of Electrical and Electronics Engineers, Hong Kong, 2014-05-31. (Conference or Workshop Paper published in Proceedings)
In this paper, we propose a novel, efficient stereo visual-odometry algorithm for ground vehicles moving in outdoor environments. To avoid the drawbacks of computationally-expensive outlier-removal steps based on random-sample schemes, we use a single-degree-of-freedom kinematic model of the vehicle to initialize an Iterative Closest Point (ICP) algorithm that is utilized to select high-quality inliers. The motion is then computed incrementally from the inliers using a standard linear 3D-to-2D pose-estimation method without any additional batch optimization. The performance of the approach is evaluated against state-of-the-art methods on both synthetic data and publicly-available datasets (e.g., KITTI and Devon Island) collected over several kilometers in both urban environments and challenging off-road terrains. Experiments show that our algorithm outperforms state-of-the-art approaches in accuracy, runtime, and ease of implementation.
|
Matia Pizzoli, Christian Forster, Davide Scaramuzza, REMODE: probabilistic, monocular dense reconstruction in real time, In: IEEE International Conference on Robotics and Automation (ICRA), Institute of Electrical and Electronics Engineers, Hong Kong, 2014-05-31. (Conference or Workshop Paper published in Proceedings)
In this paper, we solve the problem of estimating dense and accurate depth maps from a single moving camera. A probabilistic depth measurement is carried out in real time on a per-pixel basis and the computed uncertainty is used to reject erroneous estimations and provide live feedback on the reconstruction progress. Our contribution is a novel approach to depth map computation that combines Bayesian estimation and recent developments in convex optimization for image processing. We demonstrate that our method outperforms state-of-the-art techniques in terms of accuracy, while exhibiting high efficiency in memory usage and computing power. We call our approach REMODE (REgularized MOnocular Depth Estimation). Our CUDA-based implementation runs at 30 Hz on a laptop computer and is released as open-source software.
|
Andrea Censi, Davide Scaramuzza, Low-latency event-based visual odometry, In: IEEE International Conference on Robotics and Automation (ICRA), Institute of Electrical and Electronics Engineers, Hong Kong, 2014-05-31. (Conference or Workshop Paper published in Proceedings)
The agility of a robotic system is ultimately limited by the speed of its processing pipeline. The use of a Dynamic Vision Sensor (DVS), a sensor producing asynchronous events as luminance changes are perceived by its pixels, makes it possible to have a sensing pipeline with a theoretical latency of a few microseconds. However, several challenges must be overcome: a DVS does not provide the grayscale value but only changes in the luminance; and because the output is composed of a sequence of events, traditional frame-based visual odometry methods are not applicable. This paper presents the first visual odometry system based on a DVS plus a normal CMOS camera to provide the absolute brightness values. The two sources of data are automatically spatiotemporally calibrated from logs taken during normal operation. We design a visual odometry method that uses the DVS events to estimate the relative displacement since the previous CMOS frame by processing each event individually. Experiments show that the rotation can be estimated with surprising accuracy, while the translation can be estimated only very noisily, because it produces few events due to very small apparent motion.
|
Christian Forster, Matia Pizzoli, Davide Scaramuzza, SVO: fast semi-direct monocular visual odometry, In: IEEE International Conference on Robotics and Automation (ICRA), Institute of Electrical and Electronics Engineers, Hong Kong, 2014-05-31. (Conference or Workshop Paper published in Proceedings)
We propose a semi-direct monocular visual odometry algorithm that is precise, robust, and faster than current state-of-the-art methods. The semi-direct approach eliminates the need for costly feature extraction and robust matching techniques for motion estimation. Our algorithm operates directly on pixel intensities, which results in subpixel precision at high frame-rates. A probabilistic mapping method that explicitly models outlier measurements is used to estimate 3D points, which results in fewer outliers and more reliable points. Precise and high frame-rate motion estimation brings increased robustness in scenes with little, repetitive, and high-frequency texture. The algorithm is applied to micro-aerial-vehicle state-estimation in GPS-denied environments and runs at 55 frames per second on the onboard embedded computer and at more than 300 frames per second on a consumer laptop. We call our approach SVO (Semi-direct Visual Odometry) and release our implementation as open-source software.
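The "direct" part of the semi-direct approach scores a candidate pose by comparing raw pixel intensities around sparse features instead of matching descriptors. A toy version of that photometric residual is sketched below (our own simplification; SVO itself minimizes such residuals over the 6-DoF pose with an efficient alignment scheme):

```python
import numpy as np

def photometric_residual(img_ref, img_cur, px_ref, px_cur, half=2):
    """Sum of squared intensity differences between a small patch around
    a feature at px_ref=(x, y) in the reference image and the patch at
    px_cur in the current image -- the quantity that direct sparse image
    alignment drives to zero by adjusting the camera pose."""
    r = 0.0
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            i_ref = img_ref[px_ref[1] + dy, px_ref[0] + dx]
            i_cur = img_cur[px_cur[1] + dy, px_cur[0] + dx]
            r += float(i_ref - i_cur) ** 2
    return r
```

Because only a few small patches are compared, the residual is cheap to evaluate, which is one reason a semi-direct pipeline can run at hundreds of frames per second.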
|
András L Majdik, Damiano Verda, Yves Albers-Schoenberg, Davide Scaramuzza, Micro air vehicle localization and position tracking from textured 3D cadastral models, In: IEEE International Conference on Robotics and Automation (ICRA), Institute of Electrical and Electronics Engineers, Hong Kong, 2014-05-31. (Conference or Workshop Paper published in Proceedings)
In this paper, we address the problem of localizing a camera-equipped Micro Aerial Vehicle (MAV) flying in urban streets at low altitudes. An appearance-based global positioning system to localize MAVs with respect to the surrounding buildings is introduced. We rely on an air-ground image matching algorithm to search the airborne image of the MAV within a ground-level Street View image database and to detect image matching points. Based on the image matching points, we infer the global position of the MAV by back-projecting the corresponding image points onto a cadastral 3D city model. Furthermore, we describe an algorithm to track the position of the flying vehicle over several frames and to correct the accumulated drift of the visual odometry, whenever a good match is detected between the airborne MAV and the street-level images. The proposed approach is tested on a dataset captured with a small quadrotor flying in the streets of Zurich.
|
Reza Sabzevari, Davide Scaramuzza, Monocular simultaneous multi-body motion segmentation and reconstruction from perspective views, In: IEEE International Conference on Robotics and Automation (ICRA), Institute of Electrical and Electronics Engineers, Hong Kong, 2014-05-31. (Conference or Workshop Paper published in Proceedings)
In this paper, we tackle the problem of mapping multiple 3D rigid structures and estimating their motions from perspective views through a car-mounted camera. The proposed method complements conventional localization and mapping algorithms (such as Visual Odometry and SLAM) to estimate motions of other moving objects in addition to the vehicle's motion. We present a theoretical framework for robust estimation of multiple motions and structures from perspective images. The method is based on the factorization of the projective trajectory matrix without explicit estimation of projective depth values. We exploit the epipolar geometry of calibrated cameras to generate several hypotheses for motion segments. Once the hypotheses are obtained, they are evaluated in an iterative scheme by alternating between estimation of 3D structures and estimation of multiple motions. The proposed framework does not require any knowledge about the number of motions and is robust to noisy image measurements. The method is evaluated on street-level sequences from a car-mounted camera. A benchmark dataset is also used to compare the results with previous works, although most of the related works use synthetic scenes simulating desktop environments.
|
Elias Müggler, Matthias Fässler, Karl Schwabe, Davide Scaramuzza, A monocular pose estimation system based on infrared LEDs, In: IEEE International Conference on Robotics and Automation (ICRA), Institute of Electrical and Electronics Engineers, Hong Kong, 2014-05-31. (Conference or Workshop Paper published in Proceedings)
We present an accurate, efficient, and robust pose estimation system based on infrared LEDs. They are mounted on a target object and are observed by a camera that is equipped with an infrared-pass filter. The correspondences between LEDs and image detections are first determined using a combinatorial approach and then tracked using a constant-velocity model. The pose of the target object is estimated with a P3P algorithm and optimized by minimizing the reprojection error. Since the system works in the infrared spectrum, it is robust to cluttered environments and illumination changes. In a variety of experiments, we show that our system outperforms state-of-the-art approaches. Furthermore, we successfully apply our system to stabilize a quadrotor both indoors and outdoors under challenging conditions. We release our implementation as open-source software.
|