Antonio Loquercio, Ana I Maqueda, Carlos R Del-Blanco, Davide Scaramuzza, DroNet: Learning to Fly by Driving, IEEE Robotics and Automation Letters, Vol. 3 (2), 2018. (Journal Article)
Civilian drones are soon expected to be used in a wide variety of tasks, such as aerial surveillance, delivery, or monitoring of existing architectures. Nevertheless, their deployment in urban environments has so far been limited. Indeed, in unstructured and highly dynamic scenarios, drones face numerous challenges to navigate autonomously in a feasible and safe way. In contrast to traditional “map-localize-plan” methods, this letter explores a data-driven approach to cope with the above challenges. To accomplish this, we propose DroNet: a convolutional neural network that can safely drive a drone through the streets of a city. Designed as a fast eight-layer residual network, DroNet produces two outputs for each input image: a steering angle to keep the drone navigating while avoiding obstacles, and a collision probability to let the UAV recognize dangerous situations and promptly react to them. The challenge, however, is to collect enough data in an unstructured outdoor environment such as a city. Clearly, having an expert pilot provide training trajectories is not an option, given the large amount of data required and, above all, the risk it poses to other vehicles and pedestrians moving in the streets. Therefore, we propose to train a UAV from data collected by cars and bicycles, which, being already integrated into the urban environment, do not endanger other vehicles and pedestrians. Although trained on city streets from the viewpoint of urban vehicles, the navigation policy learned by DroNet is highly generalizable. Indeed, it allows a UAV to successfully fly at relatively high altitudes and even in indoor environments, such as parking lots and corridors. To share our findings with the robotics community, we publicly release all our datasets, code, and trained networks. Video of the experiments: https://youtu.be/ow7aw9H4BcA
The project’s code, datasets, and trained models are available at: http://rpg.ifi.uzh.ch/dronet.html
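As a rough illustration of the two-headed design described in the abstract, the sketch below pairs a shared convolutional trunk with a regression head for the steering angle and a classification head for the collision probability. The layer sizes and names are illustrative assumptions, not the exact DroNet architecture.

```python
# Minimal sketch of a DroNet-style two-headed network (assumed layer sizes,
# not the paper's exact 8-layer residual architecture).
import torch
import torch.nn as nn

class DroNetSketch(nn.Module):
    def __init__(self):
        super().__init__()
        # Shared convolutional trunk (stand-in for the residual body).
        self.trunk = nn.Sequential(
            nn.Conv2d(1, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.steer = nn.Linear(128, 1)  # regression head: steering angle
        self.coll = nn.Linear(128, 1)   # classification head: collision logit

    def forward(self, img):
        f = self.trunk(img)
        return self.steer(f), torch.sigmoid(self.coll(f))

net = DroNetSketch()
steering, p_collision = net(torch.randn(1, 1, 200, 200))  # grayscale input
```

In such a setup the two heads would typically be trained jointly, with a regression loss on the steering output and a binary cross-entropy loss on the collision output.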
Antonio Rosinol Vidal, Henri Rebecq, Timo Horstschaefer, Davide Scaramuzza, Ultimate SLAM? Combining Events, Images, and IMU for Robust Visual SLAM in HDR and High Speed Scenarios, IEEE Robotics and Automation Letters, Vol. 3 (2), 2018. (Journal Article)
Event cameras are bioinspired vision sensors that output pixel-level brightness changes instead of standard intensity frames. These cameras do not suffer from motion blur and have a very high dynamic range, which enables them to provide reliable visual information during high-speed motions or in scenes characterized by high dynamic range. However, event cameras provide little information when the amount of motion is limited, such as when the camera is nearly still. Conversely, standard cameras provide instant and rich information about the environment most of the time (in low-speed and good lighting scenarios), but they fail severely in case of fast motions, or difficult lighting such as high dynamic range or low light scenes. In this letter, we present the first state estimation pipeline that leverages the complementary advantages of these two sensors by fusing events, standard frames, and inertial measurements in a tightly coupled manner. We show on the publicly available Event Camera Dataset that our hybrid pipeline leads to an accuracy improvement of 130% over event-only pipelines, and 85% over standard-frames-only visual-inertial systems, while still being computationally tractable. Furthermore, we use our pipeline to demonstrate, to the best of our knowledge, the first autonomous quadrotor flight using an event camera for state estimation, unlocking flight scenarios that were not reachable with traditional visual-inertial odometry, such as low-light environments and high dynamic range scenes. Videos of the experiments: http://rpg.ifi.uzh.ch/ultimateslam.html
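A tightly coupled fusion of this kind is typically posed as a joint nonlinear least-squares problem over poses, velocities, and sensor biases. A schematic objective, written in our own notation rather than quoted from the paper, would be:

```latex
% Schematic tightly-coupled objective: feature reprojection residuals from
% standard frames (F) and from event-based frames (E), plus preintegrated
% inertial residuals (I), minimized over the states X.
\min_{\mathcal{X}} \;
  \sum_{k \in F} \big\| \mathbf{r}_{F}^{k}(\mathcal{X}) \big\|_{\Sigma_F}^{2}
+ \sum_{k \in E} \big\| \mathbf{r}_{E}^{k}(\mathcal{X}) \big\|_{\Sigma_E}^{2}
+ \sum_{j \in I} \big\| \mathbf{r}_{I}^{j}(\mathcal{X}) \big\|_{\Sigma_I}^{2}
```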
Matthias Faessler, Antonio Franchi, Davide Scaramuzza, Differential Flatness of Quadrotor Dynamics Subject to Rotor Drag for Accurate Tracking of High-Speed Trajectories, IEEE Robotics and Automation Letters, Vol. 3 (2), 2018. (Journal Article)
In this paper, we prove that the dynamical model of a quadrotor subject to linear rotor drag effects is differentially flat in its position and heading. We use this property to compute feed-forward control terms directly from a reference trajectory to be tracked. The obtained feed-forward terms are then used in a cascaded, nonlinear feedback control law that enables accurate agile flight with quadrotors. Compared to state-of-the-art control methods, which treat the rotor drag as an unknown disturbance, our method reduces the trajectory tracking error significantly. Finally, we present a method based on gradient-free optimization to identify the rotor drag coefficients, which are required to compute the feed-forward control terms. The new theoretical results are thoroughly validated through extensive comparative experiments.
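For reference, the linear rotor-drag model considered here augments the standard quadrotor translational dynamics with a drag term linear in the body-frame velocity. In common notation (our choice of symbols, following the usual presentation rather than quoting the paper):

```latex
% Translational dynamics with linear rotor drag:
%   p, v : position and velocity in the world frame
%   g    : gravity; z_W, z_B : world and body z-axes; c : mass-normalized thrust
%   R    : body-to-world rotation; D = diag(d_x, d_y, d_z) : drag coefficients
\dot{\mathbf{p}} = \mathbf{v}, \qquad
\dot{\mathbf{v}} = -g\,\mathbf{z}_W + c\,\mathbf{z}_B
                   - \mathbf{R}\,\mathbf{D}\,\mathbf{R}^{\top}\mathbf{v}
```

Differential flatness in position and heading then means that, given a sufficiently smooth reference for those outputs, the remaining states and inputs (attitude, collective thrust, body rates) follow algebraically from the reference and its derivatives, which is what yields the feed-forward terms.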
Florentin Liebmann, Augmented reality guided corrective forearm osteotomy using marker-based surgical pin tracking, implemented on a Microsoft HoloLens: a feasibility study, University of Zurich, Faculty of Business, Economics and Informatics, 2018. (Master's Thesis)
3D printed patient-specific instruments (PSI) ensure the correct translation of a computer-assisted preoperative planning into the operating room and are the gold standard for corrective osteotomies. Although PSI greatly increase the accuracy of clinical outcomes, their long production times and high costs can make them unattractive. Furthermore, they lack on-site flexibility to support intraoperative changes and cannot provide interactive or additional information. Augmented reality (AR) has the potential to overcome these limitations. We implemented a marker-based system that employs tracking of surgical pins to provide intraoperative navigation and visualization of the preoperative plan. Our method was validated technically and evaluated clinically by means of a corrective osteotomy on a cadaveric ulna. The current state of the system did not allow for an accuracy that meets clinical requirements, but the feasibility of AR in complex orthopedic procedures was demonstrated.
Daniel Gehrig, Henri Rebecq, Guillermo Gallego, Davide Scaramuzza, Asynchronous, Photometric Feature Tracking Using Events and Frames, In: Computer Vision – ECCV 2018, Springer, Cham, p. 766 - 781, 2018. (Book Chapter)
We present a method that leverages the complementarity of event cameras and standard cameras to track visual features with low latency. Event cameras are novel sensors that output pixel-level brightness changes, called "events". They offer significant advantages over standard cameras, namely a very high dynamic range, no motion blur, and a latency on the order of microseconds. However, because the same scene pattern can produce different events depending on the motion direction, establishing event correspondences across time is challenging. By contrast, standard cameras provide intensity measurements (frames) that do not depend on motion direction. Our method extracts features on frames and subsequently tracks them asynchronously using events, thereby exploiting the best of both types of data: the frames provide a photometric representation that does not depend on motion direction, and the events provide low-latency updates. In contrast to previous works, which are based on heuristics, this is the first principled method that uses raw intensity measurements directly, based on a generative event model within a maximum-likelihood framework. As a result, our method produces feature tracks that are both more accurate (subpixel accuracy) and longer than the state of the art, across a wide variety of scenes.
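The generative model at the heart of this approach admits a compact illustration: over a short window, the events accumulated in a patch should match the brightness change predicted by the frame gradient and the patch's optic flow. The simplified sketch below recovers only the flow by least squares, holding the patch alignment fixed; the names, the contrast value, and this reduction are our assumptions, not the full method.

```python
# Simplified sketch of the generative event model for photometric feature
# tracking: events accumulated over a patch are explained as
# -grad(L) . v * dt, and the patch flow v is recovered by least squares.
# The joint alignment (warp) estimation of the full method is omitted.
import numpy as np

def accumulate_events(events, patch_origin, patch_size, contrast=0.1):
    """Build the measured brightness-increment patch from events.
    events: iterable of (x, y, polarity) with polarity in {-1, +1}."""
    dL = np.zeros((patch_size, patch_size))
    ox, oy = patch_origin
    for x, y, pol in events:
        u, v = x - ox, y - oy
        if 0 <= u < patch_size and 0 <= v < patch_size:
            dL[v, u] += pol * contrast
    return dL

def estimate_patch_flow(grad_x, grad_y, dL, dt):
    """Least-squares flow from the linearized generative model
    dL ~= -(grad_x * vx + grad_y * vy) * dt, stacked over all pixels."""
    A = -dt * np.stack([grad_x.ravel(), grad_y.ravel()], axis=1)
    b = dL.ravel()
    v, *_ = np.linalg.lstsq(A, b, rcond=None)
    return v  # (vx, vy) in pixels per second
```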
Yi Zhou, Guillermo Gallego, Henri Rebecq, Laurent Kneip, Hongdong Li, Davide Scaramuzza, Semi-dense 3D Reconstruction with a Stereo Event Camera, In: Computer Vision – ECCV 2018, Springer, Cham, p. 242 - 258, 2018. (Book Chapter)
Event cameras are bio-inspired sensors that offer several advantages, such as low latency, high speed, and high dynamic range, to tackle challenging scenarios in computer vision. This paper presents a solution to the problem of 3D reconstruction from data captured by a stereo event-camera rig moving in a static scene, such as in the context of stereo Simultaneous Localization and Mapping. The proposed method consists of the optimization of an energy function designed to exploit small-baseline spatio-temporal consistency of events triggered across both stereo image planes. To improve the density of the reconstruction and to reduce the uncertainty of the estimation, a probabilistic depth-fusion strategy is also developed. The resulting method has no special requirements on either the motion of the stereo event-camera rig or on prior knowledge about the scene. Experiments demonstrate that our method can deal with both texture-rich and sparse scenes, outperforming state-of-the-art stereo methods based on event data image representations.
Kartik Mohta, Michael Watterson, Yash Mulgaonkar, Sikang Liu, Chao Qu, Anurag Makineni, Kelsey Saulnier, Ke Sun, Alex Zhu, Jeffrey Delmerico, Konstantinos Karydis, Nikolay Atanasov, Giuseppe Loianno, Davide Scaramuzza, Kostas Daniilidis, Camillo Jose Taylor, Vijay Kumar, Fast, autonomous flight in GPS-denied and cluttered environments, Journal of Field Robotics, Vol. 35 (1), 2018. (Journal Article)
One of the most challenging tasks for a flying robot is to autonomously navigate between target locations quickly and reliably while avoiding obstacles in its path, and with little to no a priori knowledge of the operating environment. This challenge is addressed in the present paper. We describe the system design and software architecture of our proposed solution, and showcase how all the distinct components can be integrated to enable smooth robot operation. We provide critical insight on hardware and software component selection and development, and present results from extensive experimental testing in real-world warehouse environments. Experimental testing reveals that our proposed solution can deliver fast and robust aerial robot autonomous navigation in cluttered, GPS-denied environments.
Raffael Theiler, Event-Camera Dataset Generation from Photorealistically Rendered Scenes, University of Zurich, Faculty of Business, Economics and Informatics, 2018. (Bachelor's Thesis)
With the increased use of Dynamic and Active-pixel Vision Sensors (DAVIS) in mobile robotics, the demand for application-themed datasets with ground truth is growing. The sensor systems required to record accurate ground truth, the DAVIS sensors themselves, and the associated effort all incur high financial and temporal costs. To overcome this barrier, we present an easy-to-use, modular simulator that operates on top of Unreal Engine and generates photorealistic datasets with DVS and IMU simulation. In addition to a trajectory-recording mode, the simulator also supports editing and playback of existing trajectories, multi-view recording, and a configurable field of view. A dataset generated in our simulator has intrinsically accurate and precise ground truth and is reproducible with modifications to the environment or configuration, giving the user great flexibility. In this thesis, we describe the functionality of the simulator as well as the challenges encountered in working with a video game rendering pipeline.
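The core DVS simulation step can be sketched compactly: an event fires at a pixel whenever its log intensity has changed by more than a contrast threshold since the last event there, with timestamps interpolated between rendered frames. This is the standard frame-based approximation; the function and parameter names below are assumptions, not the thesis' exact implementation.

```python
# Minimal sketch of DVS event generation from rendered frames. For brevity,
# at most one event per pixel is emitted per frame interval (a real
# simulator would emit multiple events when the change exceeds 2C, 3C, ...).
import numpy as np

def frames_to_events(frames, timestamps, C=0.15, eps=1e-6):
    """frames: list of HxW grayscale arrays in [0, 1]; returns (t, x, y, pol)."""
    ref = np.log(frames[0] + eps)  # per-pixel log intensity at last event
    events = []
    for k in range(1, len(frames)):
        cur = np.log(frames[k] + eps)
        diff = cur - ref
        ys, xs = np.nonzero(np.abs(diff) >= C)
        for y, x in zip(ys, xs):
            pol = 1 if diff[y, x] > 0 else -1
            # Interpolate the threshold-crossing time within the interval.
            frac = C / abs(diff[y, x])
            t = timestamps[k - 1] + frac * (timestamps[k] - timestamps[k - 1])
            events.append((t, x, y, pol))
            ref[y, x] += pol * C  # advance the reference by one threshold
    events.sort()
    return events
```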
Henri Rebecq, Guillermo Gallego Bonet, Davide Scaramuzza, Simultaneous Localization and Mapping with an Event Camera (Patent), 2018. (Other Publication)
Pub. No.: WO/2018/037079. International Application No.: PCT/EP2017/071331
Giuseppe Loianno, Davide Scaramuzza, Vijay Kumar, Special Issue on High-Speed Vision-Based Autonomous Navigation of UAVs, Journal of Field Robotics, Vol. 1 (1), 2018. (Journal Article)
This first special issue of the Journal of Field Robotics (JFR) on vision-based high-speed autonomous navigation of UAVs aims to establish a baseline in the field of autonomous navigation of UAVs using vision and IMUs as the main sensing modalities. The goal of the research reported in this special issue is to present the most recent state of the art for executing fast autonomous operations with MAVs. The approaches collected here will help inform the community of the most recent and innovative methods and extend the capabilities of current and future robotic missions.
Jeffrey Delmerico, Stefan Isler, Reza Sabzevari, Davide Scaramuzza, A comparison of volumetric information gain metrics for active 3D object reconstruction, Autonomous Robots, Vol. 42 (2), 2018. (Journal Article)
In this paper, we investigate the following question: when performing next best view selection for volumetric 3D reconstruction of an object by a mobile robot equipped with a dense (camera-based) depth sensor, what formulation of information gain is best? To address this question, we propose several new ways to quantify the volumetric information (VI) contained in the voxels of a probabilistic volumetric map, and compare them to the state of the art with extensive simulated experiments. Our proposed formulations incorporate factors such as visibility likelihood and the likelihood of seeing new parts of the object. The results of our experiments allow us to draw some clear conclusions about the VI formulations that are most effective in different mobile-robot reconstruction scenarios. To the best of our knowledge, this is the first comparative survey of VI formulation performance for active 3D object reconstruction. Additionally, our modular software framework is adaptable to other robotic platforms and general reconstruction problems, and we release it open source for autonomous reconstruction tasks.
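One family of such formulations weights each voxel's entropy by the probability that a ray from the candidate view actually reaches it. The sketch below illustrates this occlusion-aware flavor; the names and the exact weighting are our assumptions rather than a specific formulation from the paper.

```python
# Illustrative visibility-aware volumetric information gain for next-best-view
# selection: rays are cast from a candidate view through a probabilistic voxel
# grid, and each voxel's entropy is weighted by the probability that the ray
# actually reaches it.
import numpy as np

def voxel_entropy(p):
    """Binary entropy of an occupancy probability."""
    p = np.clip(p, 1e-6, 1 - 1e-6)
    return -(p * np.log2(p) + (1 - p) * np.log2(1 - p))

def ray_information_gain(occupancy_along_ray):
    """occupancy_along_ray: occupancy probabilities of the voxels hit by one
    ray, ordered from the sensor outward."""
    gain, p_visible = 0.0, 1.0
    for p_occ in occupancy_along_ray:
        gain += p_visible * voxel_entropy(p_occ)  # entropy weighted by visibility
        p_visible *= (1.0 - p_occ)                # chance the ray continues
    return gain

def view_information_gain(rays):
    """Sum of per-ray gains for all rays cast from a candidate viewpoint."""
    return sum(ray_information_gain(r) for r in rays)
```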
Guillermo Gallego, Jon E A Lund, Elias Müggler, Henri Rebecq, Tobi Delbruck, Davide Scaramuzza, Event-based, 6-DOF Camera Tracking from Photometric Depth Maps, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 1 (1), 2017. (Journal Article)
Event cameras are bio-inspired vision sensors that output pixel-level brightness changes instead of standard intensity frames. These cameras do not suffer from motion blur and have a very high dynamic range, which enables them to provide reliable visual information during high-speed motions or in scenes characterized by high dynamic range. These features, along with a very low power consumption, make event cameras an ideal complement to standard cameras for VR/AR and video game applications. With these applications in mind, this paper tackles the problem of accurate, low-latency tracking of an event camera from an existing photometric depth map (i.e., intensity plus depth information) built via classic dense reconstruction pipelines. Our approach tracks the 6-DOF pose of the event camera upon the arrival of each event, thus virtually eliminating latency. We successfully evaluate the method in both indoor and outdoor scenes and show that, because of the technological advantages of the event camera, our pipeline works in scenes characterized by high-speed motion, which are still inaccessible to standard cameras.
Davide Falanga, Alessio Zanchettin, Alessandro Simovic, Jeffrey Delmerico, Davide Scaramuzza, Vision-based Autonomous Quadrotor Landing on a Moving Platform, In: IEEE/RSJ International Symposium on Safety, Security and Rescue Robotics, 2017-10-11. (Conference or Workshop Paper published in Proceedings)
Titus Cieslewski, Elia Kaufmann, Davide Scaramuzza, Rapid exploration with multi-rotors: A frontier selection method for high speed flight, In: IEEE/RSJ International Conference on Intelligent Robots and Systems, 2017-09-24. (Conference or Workshop Paper published in Proceedings)
Exploring and mapping previously unknown environments while avoiding collisions with obstacles is a fundamental task for autonomous robots. In scenarios where this needs to be done rapidly, multi-rotors are a good choice for the task, as they can cover ground at potentially very high velocities. Flying at high velocities, however, implies the ability to rapidly plan trajectories and to react to new information quickly. In this paper, we propose an extension to classical frontier-based exploration that facilitates exploration at high speeds. The extension consists of a reactive mode in which the multi-rotor rapidly selects a goal frontier from its field of view. The goal frontier is selected in a way that minimizes the change in velocity necessary to reach it. While this approach can increase the total path length, it significantly reduces the exploration time, since the multi-rotor can fly at consistently higher speeds.
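The selection criterion admits a compact illustration: among the frontier points currently in the field of view, pick the one that minimizes the change of velocity needed to head toward it at cruise speed. The sketch below is an illustrative reading of the abstract; the function names and the exact cost are assumptions.

```python
# Toy sketch of reactive frontier selection that minimizes the required
# change of velocity, as described in the abstract above.
import numpy as np

def select_goal_frontier(position, velocity, frontiers_in_fov, v_max):
    """position, velocity: current state, np.array of shape (3,);
    frontiers_in_fov: list of candidate frontier points, shape (3,) each."""
    best, best_cost = None, np.inf
    for f in frontiers_in_fov:
        direction = f - position
        dist = np.linalg.norm(direction)
        if dist < 1e-6:
            continue
        desired_v = v_max * direction / dist         # fly toward the frontier
        cost = np.linalg.norm(desired_v - velocity)  # required velocity change
        if cost < best_cost:
            best, best_cost = f, cost
    return best
```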
Yawei Ye, Titus Cieslewski, Antonio Loquercio, Davide Scaramuzza, Place recognition in semi-dense maps: Geometric and learning-based approaches, In: British Machine Vision Conference, 2017-09-04. (Conference or Workshop Paper published in Proceedings)
For robotics and augmented reality systems operating in large and dynamic environments, place recognition and tracking using vision represent very challenging tasks. Additionally, when these systems need to operate reliably for very long time periods, such as months or years, further challenges are introduced by severe environmental changes that can significantly alter the visual appearance of a scene. Thus, to unlock long-term, large-scale visual place recognition, it is necessary to develop new methodologies for improving localization under difficult conditions. As shown in previous work, gains in robustness can be achieved by exploiting the 3D structural information of a scene. The latter, extracted from image sequences, in fact carries more discriminative clues than individual images alone. In this paper, we propose to represent a scene’s structure with semi-dense point clouds, due to their highly informative power and the simplicity of their generation through mature visual odometry and SLAM systems. We then cast place recognition as an instance of pose retrieval and evaluate several techniques, including recent learning-based approaches, to produce discriminative descriptors of semi-dense point clouds. Our proposed methodology, evaluated on the recently published and challenging Oxford RobotCar Dataset, is shown to outperform image-based place recognition, with improvements of up to 30% in precision across strong appearance changes. To the best of our knowledge, we are the first to propose place recognition in semi-dense maps.
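As one simple geometric baseline in the spirit of the comparison above (not the paper's specific learned descriptor), a local semi-dense point cloud can be reduced to a fixed-length descriptor by normalizing it, voxelizing it into an occupancy histogram, and matching by cosine similarity:

```python
# Illustrative geometric descriptor for semi-dense point-cloud retrieval;
# names and the grid size are assumptions for this sketch.
import numpy as np

def cloud_descriptor(points, grid=8):
    """points: (N, 3) array of a local semi-dense point cloud."""
    centered = points - points.mean(axis=0)
    scale = np.abs(centered).max() + 1e-9
    # Map each point into a grid x grid x grid occupancy histogram.
    idx = ((centered / scale * 0.5 + 0.5) * (grid - 1e-9)).astype(int)
    hist = np.zeros((grid, grid, grid))
    for i, j, k in idx:
        hist[i, j, k] += 1
    d = hist.ravel()
    return d / (np.linalg.norm(d) + 1e-12)

def retrieve(query_desc, database_descs):
    """Return the index of the most similar place by cosine similarity."""
    sims = [float(query_desc @ d) for d in database_descs]
    return int(np.argmax(sims))
```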
Elias Müggler, Chiara Bartolozzi, Davide Scaramuzza, Fast event-based corner detection, In: British Machine Vision Conference (BMVC), London, 2017-09-04. (Conference or Workshop Paper published in Proceedings)
Event cameras offer many advantages over standard frame-based cameras, such as low latency, high temporal resolution, and a high dynamic range. They respond to pixel-level brightness changes and, therefore, provide a sparse output. However, in textured scenes with rapid motion, millions of events are generated per second. Therefore, state-of-the-art event-based algorithms either require massive parallel computation (e.g., a GPU) or depart from the event-based processing paradigm. Inspired by frame-based pre-processing techniques that reduce an image to a set of features, which are typically the input to higher-level algorithms, we propose a method to reduce an event stream to a corner event stream. Our goal is twofold: extract relevant tracking information (corners do not suffer from the aperture problem) and decrease the event rate for later processing stages. Our event-based corner detector is very efficient due to its design principle, which consists of working on the Surface of Active Events (a map with the timestamp of the latest event at each pixel) using only comparison operations. Our method processes events asynchronously, one by one, with very low latency. Our implementation is capable of processing millions of events per second on a single core (less than a microsecond per event) and reduces the event rate by a factor of 10 to 20.
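The comparison-only design on the Surface of Active Events can be sketched as follows: each incoming event updates a per-pixel timestamp map, and the event is kept as a corner if some contiguous arc of pixels on a circle around it is uniformly newer than the rest. The circle and arc lengths below are illustrative simplifications of the detector, and border handling is omitted.

```python
# Simplified SAE-based corner check using only timestamp comparisons.
import numpy as np

# 16-pixel Bresenham circle of radius 3 around the event location.
CIRCLE3 = [(0, 3), (1, 3), (2, 2), (3, 1), (3, 0), (3, -1), (2, -2), (1, -3),
           (0, -3), (-1, -3), (-2, -2), (-3, -1), (-3, 0), (-3, 1), (-2, 2), (-1, 3)]

def is_corner(sae, x, y, min_arc=3, max_arc=6):
    """sae: 2D array of latest-event timestamps; (x, y): new event location.
    Border checks are omitted for brevity."""
    ts = [sae[y + dy, x + dx] for dx, dy in CIRCLE3]
    n = len(ts)
    for arc_len in range(min_arc, max_arc + 1):
        for start in range(n):
            arc = [ts[(start + i) % n] for i in range(arc_len)]
            rest = [ts[(start + arc_len + i) % n] for i in range(n - arc_len)]
            if min(arc) > max(rest):  # arc is uniformly newer: corner-like
                return True
    return False

def process_event(sae, t, x, y):
    sae[y, x] = t  # update the Surface of Active Events
    return is_corner(sae, x, y)
```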
Henri Rebecq, Timo Horstschaefer, Davide Scaramuzza, Real-time Visual-Inertial Odometry for Event Cameras using Keyframe-based Nonlinear Optimization, In: British Machine Vision Conference, 2017-09-04. (Conference or Workshop Paper published in Proceedings)
Patrick Widmer, Towards Rotation Estimation from a Single Blurred Image using Deep Learning, University of Zurich, Faculty of Business, Economics and Informatics, 2017. (Bachelor's Thesis)
Visual ego-motion estimation for autonomous robots has developed significantly and matured during the past few years. However, if the images from the cameras contain motion blur, reliable state estimation is still challenging. Recently, learning-based approaches have achieved great success in dealing with motion blur. Therefore, in this thesis, we explore learning-based methods on several problems related to visual ego-motion estimation from a single blurred image. In particular, we find that classifying the rotation axis from a fixed set of axes using a small CNN works well. In contrast, we find that estimating the rotation axis and angle at the same time, without restricting the axes, is more difficult, even for a pretrained deep CNN. We compare different approaches for solving this problem. We implement a data generation pipeline using Blender and use the simulated images and ground truth to train the CNNs.
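As a rough sketch of the classification setup that worked well, the snippet below defines a small CNN that takes a single blurred grayscale image and predicts which of a fixed set of rotation axes produced the blur. The sizes, the number of axes, and the loss are illustrative assumptions.

```python
# Minimal rotation-axis classifier sketch (assumed sizes, not the thesis'
# exact network).
import torch
import torch.nn as nn

NUM_AXES = 3  # assumed fixed set of candidate rotation axes

model = nn.Sequential(
    nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(64, NUM_AXES),  # logits over the candidate axes
)

loss_fn = nn.CrossEntropyLoss()
images = torch.randn(8, 1, 128, 128)       # batch of simulated blurred images
labels = torch.randint(0, NUM_AXES, (8,))  # ground-truth axis indices
loss = loss_fn(model(images), labels)
loss.backward()
```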
Philipp Foehn, Davide Falanga, Naveen Kuppuswamy, Russ Tedrake, Davide Scaramuzza, Fast trajectory optimization for agile quadrotor maneuvers with a cable-suspended payload, In: Robotics: Science and Systems, Robotics: Science and Systems, 2017-07-12. (Conference or Workshop Paper published in Proceedings)
Executing agile quadrotor maneuvers with cable-suspended payloads is a challenging problem, and complications induced by the dynamics typically require trajectory optimization. State-of-the-art approaches often need significant computation time and complex parameter tuning. We present a novel dynamical model and a fast trajectory optimization algorithm for quadrotors with a cable-suspended payload. Our first contribution is a new formulation of the suspended payload behavior, modeled as a link attached to the quadrotor with a combination of two revolute joints and a prismatic joint, all being passive. Unlike the state of the art, we do not require the use of hybrid modes depending on the cable tension. Our second contribution is a fast trajectory optimization technique for the aforementioned system. Our model enables us to pose the trajectory optimization problem as a Mathematical Program with Complementarity Constraints (MPCC). Desired behaviors of the system (e.g., obstacle avoidance) can easily be formulated within this framework. We show that our approach outperforms the state of the art in terms of computation speed and guarantees feasibility of the trajectory with respect to both the system dynamics and control input saturation, while utilizing far fewer tuning parameters. We experimentally validate our approach on a real quadrotor, showing that our method generalizes to a variety of tasks, such as flying through desired waypoints while avoiding obstacles, or throwing the payload toward a desired target. To the best of our knowledge, this is the first time that three-dimensional, agile maneuvers exploiting the system dynamics have been achieved on quadrotors with a cable-suspended payload.
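For context, a Mathematical Program with Complementarity Constraints has the generic form below. In this setting the complementarity pair would couple quantities that cannot be simultaneously positive, such as cable tension versus joint slack; the generic form is standard, while that pairing is our reading of the abstract.

```latex
% Generic MPCC form: two nonnegative quantities g(x) and s(x) that cannot
% be simultaneously positive (complementarity).
\begin{aligned}
\min_{\mathbf{x}} \quad & f(\mathbf{x}) \\
\text{s.t.} \quad & \mathbf{h}(\mathbf{x}) = \mathbf{0}, \\
 & \mathbf{g}(\mathbf{x}) \ge \mathbf{0}, \quad \mathbf{s}(\mathbf{x}) \ge \mathbf{0}, \\
 & \mathbf{g}(\mathbf{x})^{\top}\mathbf{s}(\mathbf{x}) = 0
\end{aligned}
```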
Valentina Vasco, A. Glover, Elias Müggler, Davide Scaramuzza, Lorenzo Natale, Chiara Bartolozzi, Independent motion detection with event-driven cameras, In: IEEE International Conference on Advanced Robotics, IEEE, IEEE International Conference on Advanced Robotics, 2017-07-10. (Conference or Workshop Paper published in Proceedings)
Unlike standard cameras that send intensity images at a constant frame rate, event-driven cameras asynchronously report pixel-level brightness changes, offering low latency and high temporal resolution (both on the order of microseconds). As such, they have great potential for fast and low-power vision algorithms for robots. Visual tracking, for example, is easily achieved even for very fast stimuli, as only moving objects cause brightness changes. However, cameras mounted on a moving robot are typically non-stationary, and the same tracking problem becomes confounded by background clutter events due to the robot's ego-motion. In this paper, we propose a method for segmenting the motion of an independently moving object for event-driven cameras. Our method detects and tracks corners in the event stream and learns the statistics of their motion as a function of the robot's joint velocities when no independently moving objects are present. During robot operation, independently moving objects are identified by discrepancies between the corner velocities predicted from ego-motion and the measured corner velocities. We validate the algorithm on data collected from the neuromorphic iCub robot. We achieve a precision of 90% and show that the method is robust to changes in the speed of both the head and the target.
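The detection rule lends itself to a short sketch: fit, on data recorded with no independently moving objects, a mapping from joint velocities to expected corner velocities, then flag corners whose measured velocity deviates from the prediction. The linear model, the threshold, and the units below are illustrative assumptions, not the paper's learned statistics.

```python
# Illustrative independent-motion test from corner-velocity discrepancies.
import numpy as np

class EgoMotionModel:
    def fit(self, joint_vels, corner_vels):
        """Least-squares fit of corner velocity ~ W @ joint velocity.
        joint_vels: (N, J) samples; corner_vels: (N, 2) pixel velocities."""
        self.W, *_ = np.linalg.lstsq(joint_vels, corner_vels, rcond=None)
        return self

    def predict(self, joint_vel):
        return joint_vel @ self.W

def is_independently_moving(model, joint_vel, measured_corner_vel, thresh=2.0):
    """Flag a tracked corner whose velocity disagrees with the ego-motion
    prediction by more than `thresh` (pixels/s, assumed units)."""
    residual = measured_corner_vel - model.predict(joint_vel)
    return np.linalg.norm(residual) > thresh
```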