Nico Messikommer, Daniel Gehrig, Mathias Gehrig, Davide Scaramuzza, Bridging the Gap Between Events and Frames Through Unsupervised Domain Adaptation, IEEE Robotics and Automation Letters, Vol. 7 (2), 2022. (Journal Article)
Reliable perception during fast motion maneuvers or in high dynamic range environments is crucial for robotic systems. Since event cameras are robust to these challenging conditions, they have great potential to increase the reliability of robot vision. However, event-based vision has been held back by the shortage of labeled datasets due to the novelty of event cameras. To overcome this drawback, we propose a task transfer method to train models directly with labeled images and unlabeled event data. Compared to previous approaches, (i) our method transfers from single images to events instead of high frame rate videos, and (ii) does not rely on paired sensor data. To achieve this, we leverage the generative event model to split event features into content and motion features. This split enables efficient matching between latent spaces for events and images, which is crucial for successful task transfer. Thus, our approach unlocks the vast amount of existing image datasets for the training of event-based neural networks. Our task transfer method consistently outperforms methods targeting Unsupervised Domain Adaptation, improving object detection by 0.26 mAP (a 93% increase) and classification accuracy by 2.7%.
Leonard Bauersfeld, Davide Scaramuzza, Range, Endurance, and Optimal Speed Estimates for Multicopters, IEEE Robotics and Automation Letters, Vol. 7 (2), 2022. (Journal Article)
Multicopters are among the most versatile mobile robots. Their applications range from inspection and mapping tasks to providing vital reconnaissance in disaster zones and to package delivery. The range, endurance, and speed a multirotor vehicle can achieve while performing its task is a decisive factor not only for vehicle design and mission planning, but also for policy makers deciding on the rules and regulations for aerial robots. To the best of the authors' knowledge, this work proposes the first approach to estimate the range, endurance, and optimal flight speed for a wide variety of multicopters. This advance is made possible by combining a state-of-the-art first-principles aerodynamic multicopter model based on blade-element-momentum theory with an electric-motor model and a gray-box battery model. This model predicts the cell voltage with only 1.3% relative error (43.1 mV), even if the battery is subjected to non-constant discharge rates. Our approach is validated with real-world experiments on a test bench as well as with flights at speeds up to 65 km/h in one of the world's largest motion-capture systems. We also present an accurate pen-and-paper algorithm to estimate the range, endurance, and optimal speed of multicopters to help future researchers build drones with maximal range and endurance, ensuring that future multirotor vehicles are even more versatile.
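The pen-and-paper estimation idea can be illustrated with a rough back-of-the-envelope calculation: hover power from momentum theory plus a parasitic drag term, divided into the battery energy. This is a hedged sketch, not the paper's blade-element-momentum or gray-box battery model; all constants (figure of merit, drag area, battery capacity) are illustrative assumptions.

```python
import math

def hover_power(mass_kg, rotor_area_m2, n_rotors=4, rho=1.225, figure_of_merit=0.7):
    """Ideal hover power from momentum theory, corrected by a figure of merit.
    A crude simplification of the paper's blade-element-momentum model."""
    thrust_per_rotor = mass_kg * 9.81 / n_rotors
    p_ideal = thrust_per_rotor ** 1.5 / math.sqrt(2 * rho * rotor_area_m2)
    return n_rotors * p_ideal / figure_of_merit

def endurance_and_range(mass_kg, rotor_area_m2, battery_wh, speed_ms,
                        drag_coeff_area=0.03, rho=1.225):
    """Rough endurance (s) and range (m): hover power plus parasitic drag power.
    Ignores induced-power reduction in forward flight, so it is pessimistic."""
    p_hover = hover_power(mass_kg, rotor_area_m2)
    p_drag = 0.5 * rho * drag_coeff_area * speed_ms ** 3  # parasitic drag power
    endurance_s = battery_wh * 3600.0 / (p_hover + p_drag)
    return endurance_s, endurance_s * speed_ms
```

Sweeping `speed_ms` over a grid and taking the argmax of the returned range would give a toy version of the optimal-speed estimate.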
Drew Hanover, Philipp Foehn, Sihao Sun, Elia Kaufmann, Davide Scaramuzza, Performance, Precision, and Payloads: Adaptive Nonlinear MPC for Quadrotors, IEEE Robotics and Automation Letters, Vol. 7 (2), 2022. (Journal Article)
Agile quadrotor flight in challenging environments has the potential to revolutionize shipping, transportation, and search and rescue applications. Nonlinear model predictive control (NMPC) has recently shown promising results for agile quadrotor control, but relies on highly accurate models for maximum performance. Hence, model uncertainties in the form of unmodeled complex aerodynamic effects, varying payloads and parameter mismatch will degrade overall system performance. In this letter, we propose L1-NMPC, a novel hybrid adaptive NMPC to learn model uncertainties online and immediately compensate for them, drastically improving performance over the non-adaptive baseline with minimal computational overhead. Our proposed architecture generalizes to many different environments, in which we evaluate wind, unknown payloads, and highly agile flight conditions. The proposed method demonstrates immense flexibility and robustness, with more than 90% tracking error reduction over non-adaptive NMPC under large unknown disturbances and without any gain tuning. In addition, the same controller with identical gains can accurately fly highly agile racing trajectories exhibiting top speeds of 70 km/h, offering tracking performance improvements of around 50% relative to the non-adaptive NMPC baseline.
Manasi Muglikar, Guillermo Gallego, Davide Scaramuzza, ESL: Event-based Structured Light, In: 2021 International Conference on 3D Vision (3DV), IEEE, 2022. (Conference or Workshop Paper published in Proceedings)
Event cameras are bio-inspired sensors providing significant advantages over standard cameras such as low latency, high temporal resolution, and high dynamic range. We propose a novel structured-light system using an event camera to tackle the problem of accurate and high-speed depth sensing. Our setup consists of an event camera and a laser-point projector that uniformly illuminates the scene in a raster scanning pattern over 16 ms. Previous methods match events independently of each other, and so they deliver noisy depth estimates at high scanning speeds in the presence of signal latency and jitter. In contrast, we optimize an energy function designed to exploit event correlations, called spatio-temporal consistency. The resulting method is robust to event jitter and therefore performs better at higher scanning speeds. Experiments demonstrate that our method can deal with high-speed motion and outperform state-of-the-art 3D reconstruction methods based on event cameras, reducing the RMSE by 83% on average, for the same acquisition time. Code and dataset are available at http://rpg.ifi.uzh.ch/esl/.
Manasi Muglikar, Diederik Paul Moeys, Davide Scaramuzza, Event Guided Depth Sensing, In: 2021 International Conference on 3D Vision (3DV), IEEE, 2022. (Conference or Workshop Paper published in Proceedings)
Active depth sensors like structured light, lidar, and time-of-flight systems sample the depth of the entire scene uniformly at a fixed scan rate. This leads to limited spatiotemporal resolution where redundant static information is over-sampled and precious motion information might be under-sampled. In this paper, we present an efficient bio-inspired event-camera-driven depth estimation algorithm. In our approach, we dynamically illuminate areas of interest densely, depending on the scene activity detected by the event camera, and sparsely illuminate areas in the field of view with no motion. The depth estimation is achieved by an event-based structured light system consisting of a laser point projector coupled with a second event-based sensor tuned to detect the reflection of the laser from the scene. We show the feasibility of our approach in a simulated autonomous driving scenario and real indoor sequences using our prototype. We show that, in natural scenes like autonomous driving and indoor environments, moving edges correspond to less than 10% of the scene on average. Thus our setup requires the sensor to scan only 10% of the scene, which could lead to almost 90% less power consumption by the illumination source. While we present the evaluation and proof-of-concept for an event-based structured-light system, the ideas presented here are applicable for a wide range of depth sensing modalities like LIDAR, time-of-flight, and standard stereo.
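The event-guided sampling idea — illuminate densely where the event camera reports motion, sparsely elsewhere — can be sketched as a simple mask over a per-pixel event-count map. This is a toy illustration under stated assumptions, not the paper's method; the threshold and the crude 4-neighbour dilation are illustrative choices.

```python
import numpy as np

def illumination_mask(event_count, threshold=1, dilate=1):
    """Select pixels to illuminate densely: regions with event activity
    (motion), grown by a crude dilation so the edges of moving objects
    are fully covered. Everything outside the mask would be scanned
    sparsely, saving illumination power."""
    mask = event_count >= threshold
    for _ in range(dilate):
        m = mask.copy()
        m[1:, :] |= mask[:-1, :]   # grow downwards
        m[:-1, :] |= mask[1:, :]   # grow upwards
        m[:, 1:] |= mask[:, :-1]   # grow right
        m[:, :-1] |= mask[:, 1:]   # grow left
        mask = m
    return mask
```

On scenes where moving edges cover under 10% of pixels, as the paper reports, `mask.mean()` stays small, which is what enables the claimed power saving.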
Mathias Gehrig, Mario Millhäusler, Daniel Gehrig, Davide Scaramuzza, E-RAFT: Dense Optical Flow from Event Cameras, In: 2021 International Conference on 3D Vision (3DV), IEEE, 2022. (Conference or Workshop Paper published in Proceedings)
We propose to incorporate feature correlation and sequential processing into dense optical flow estimation from event cameras. Modern frame-based optical flow methods heavily rely on matching costs computed from feature correlation. In contrast, there exists no optical flow method for event cameras that explicitly computes matching costs. Instead, learning-based approaches using events usually resort to the U-Net architecture to estimate optical flow sparsely. Our key finding is that the introduction of correlation features significantly improves results compared to previous methods that solely rely on convolution layers. Compared to the state-of-the-art, our proposed approach computes dense optical flow and reduces the end-point error by 23% on MVSEC. Furthermore, we show that all existing optical flow methods developed so far for event cameras have been evaluated on datasets with very small displacement fields with a maximum flow magnitude of 10 pixels. Based on this observation, we introduce a new real-world dataset that exhibits displacement fields with magnitudes up to 210 pixels and 3 times higher camera resolution. Our proposed approach reduces the end-point error on this dataset by 66%.
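The matching-cost idea the abstract refers to — all-pairs feature correlation, as popularized by frame-based RAFT — can be sketched in a few lines. This is a hedged illustration of the correlation-volume concept, not E-RAFT's implementation; the shapes, the `1/sqrt(C)` scaling, and the function name are illustrative assumptions.

```python
import numpy as np

def correlation_volume(f1, f2):
    """All-pairs correlation between two feature maps of shape (C, H, W).
    Entry [i, j, k, l] is the matching cost between pixel (i, j) in the
    first map and pixel (k, l) in the second."""
    c, h, w = f1.shape
    a = f1.reshape(c, -1)            # (C, H*W)
    b = f2.reshape(c, -1)            # (C, H*W)
    corr = a.T @ b / np.sqrt(c)      # (H*W, H*W) matching costs
    return corr.reshape(h, w, h, w)
```

In a RAFT-style network this volume is computed once, then looked up iteratively around the current flow estimate rather than searched exhaustively.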
Philipp Foehn, Dario Brescianini, Elia Kaufmann, Titus Cieslewski, Mathias Gehrig, Manasi Muglikar, Davide Scaramuzza, AlphaPilot: autonomous drone racing, Autonomous Robots, Vol. 46 (1), 2022. (Journal Article)
This paper presents a novel system for autonomous, vision-based drone racing combining learned data abstraction, nonlinear filtering, and time-optimal trajectory planning. The system has successfully been deployed at the first autonomous drone racing world championship: the 2019 AlphaPilot Challenge. Contrary to traditional drone racing systems, which only detect the next gate, our approach makes use of any visible gate and takes advantage of multiple, simultaneous gate detections to compensate for drift in the state estimate and build a global map of the gates. The global map and drift-compensated state estimate allow the drone to navigate through the race course even when the gates are not immediately visible and further enable planning a near time-optimal path through the race course in real time based on approximate drone dynamics. The proposed system has been demonstrated to successfully guide the drone through tight race courses reaching speeds up to 8 m/s and ranked second at the 2019 AlphaPilot Challenge.
Philipp Föhn, Agile Aerial Autonomy: Planning and Control, University of Zurich, Faculty of Business, Economics and Informatics, 2022. (Dissertation)
Florian Fuchs, Yunlong Song, Elia Kaufmann, Davide Scaramuzza, Peter Dürr, Super-Human Performance in Gran Turismo Sport Using Deep Reinforcement Learning, IEEE Robotics and Automation Letters, Vol. 6 (3), 2022. (Journal Article)
Autonomous car racing is a major challenge in robotics. It raises fundamental problems for classical approaches such as planning minimum-time trajectories under uncertain dynamics and controlling the car at the limits of its handling. Besides, the requirement of minimizing the lap time, which is a sparse objective, and the difficulty of collecting training data from human experts have also hindered researchers from directly applying learning-based approaches to solve the problem. In the present work, we propose a learning-based system for autonomous car racing by leveraging a high-fidelity physical car simulation, a course-progress proxy reward, and deep reinforcement learning. We deploy our system in Gran Turismo Sport, a world-leading car simulator known for its realistic physics simulation of different race cars and tracks, which is even used to recruit human race car drivers. Our trained policy achieves autonomous racing performance that goes beyond what had been achieved so far by the built-in AI, and at the same time, outperforms the fastest driver in a dataset of over 50,000 human players.
Mathias Gehrig, Willem Aarents, Daniel Gehrig, Davide Scaramuzza, DSEC: A Stereo Event Camera Dataset for Driving Scenarios, IEEE Robotics and Automation Letters, Vol. 6 (3), 2022. (Journal Article)
Once an academic venture, autonomous driving has received unparalleled corporate funding in the last decade. Still, operating conditions of current autonomous cars are mostly restricted to ideal scenarios. This means that driving in challenging illumination conditions such as night, sunrise, and sunset remains an open problem. In these cases, standard cameras are being pushed to their limits in terms of low light and high dynamic range performance. To address these challenges, we propose DSEC, a new dataset that contains such demanding illumination conditions and provides a rich set of sensory data. DSEC offers data from a wide-baseline stereo setup of two color frame cameras and two high-resolution monochrome event cameras. In addition, we collect lidar data and RTK GPS measurements, both hardware synchronized with all camera data. One of the distinctive features of this dataset is the inclusion of high-resolution event cameras. Event cameras have received increasing attention for their high temporal resolution and high dynamic range performance. However, due to their novelty, event camera datasets in driving scenarios are rare. This work presents the first high resolution, large scale stereo dataset with event cameras. The dataset contains 53 sequences collected by driving in a variety of illumination conditions and provides ground truth disparity for the development and evaluation of event-based stereo algorithms.
Christian Pfeiffer, Davide Scaramuzza, Human-Piloted Drone Racing: Visual Processing and Control, IEEE Robotics and Automation Letters, Vol. 6 (2), 2022. (Journal Article)
Humans race drones faster than algorithms, despite being limited to a fixed camera angle, body rate control, and response latencies in the order of hundreds of milliseconds. A better understanding of how human pilots select appropriate motor commands from highly dynamic visual information may provide key insights for solving current challenges in vision-based autonomous navigation. This work investigates the relationship between human eye movements, control behavior, and flight performance in a drone racing task. We collected a multimodal dataset from 21 experienced drone pilots using a highly realistic drone racing simulator, also used to recruit professional pilots. Our results show task-specific improvements in drone racing performance over time. In particular, we found that eye gaze tracks future waypoints (i.e., gates), with first fixations occurring on average 1.5 seconds and 16 meters before reaching the gate. Moreover, human pilots consistently looked at the inside of the future flight path for lateral (i.e., left and right turns) and vertical maneuvers (i.e., ascending and descending). Finally, we found a strong correlation between pilots' eye movements and the commanded direction of quadrotor flight, with an average visual-motor response latency of 220 ms. These results highlight the importance of coordinated eye movements in human-piloted drone racing. We make our dataset publicly available.
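A visual-motor response latency like the 220 ms reported here is commonly estimated by cross-correlating the gaze signal with the command signal and reading off the lag of the correlation peak. The sketch below illustrates that generic technique under stated assumptions; it is not the paper's analysis pipeline, and the signal names and sampling rate are illustrative.

```python
import numpy as np

def visuomotor_latency_ms(gaze, command, dt_ms):
    """Estimate the lag (in ms) at which the command signal best follows
    the gaze signal, via the peak of the normalized cross-correlation.
    Positive result: the command trails the gaze."""
    g = (gaze - gaze.mean()) / gaze.std()
    c = (command - command.mean()) / command.std()
    xcorr = np.correlate(c, g, mode="full")
    lag = np.argmax(xcorr) - (len(g) - 1)   # shift of the correlation peak
    return lag * dt_ms
```

With 1D signals (e.g. horizontal gaze angle vs. commanded yaw rate) sampled on a common clock, the returned lag is a direct latency estimate in milliseconds.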
Daniel Gehrig, Michelle Rüegg, Mathias Gehrig, Javier Hidalgo-Carrio, Davide Scaramuzza, Combining Events and Frames Using Recurrent Asynchronous Multimodal Networks for Monocular Depth Prediction, IEEE Robotics and Automation Letters, Vol. 6 (2), 2022. (Journal Article)
Event cameras are novel vision sensors that report per-pixel brightness changes as a stream of asynchronous “events”. They offer significant advantages compared to standard cameras due to their high temporal resolution, high dynamic range and lack of motion blur. However, events only measure the varying component of the visual signal, which limits their ability to encode scene context. By contrast, standard cameras measure absolute intensity frames, which capture a much richer representation of the scene. Both sensors are thus complementary. However, due to the asynchronous nature of events, combining them with synchronous images remains challenging, especially for learning-based methods. This is because traditional recurrent neural networks (RNNs) are not designed for asynchronous and irregular data from additional sensors. To address this challenge, we introduce Recurrent Asynchronous Multimodal (RAM) networks, which generalize traditional RNNs to handle asynchronous and irregular data from multiple sensors. Inspired by traditional RNNs, RAM networks maintain a hidden state that is updated asynchronously and can be queried at any time to generate a prediction. We apply this novel architecture to monocular depth estimation with events and frames where we show an improvement over state-of-the-art methods by up to 30% in terms of mean absolute depth error. To enable further research on multimodal learning with events, we release EventScape, a new dataset with events, intensity frames, semantic labels, and depth maps recorded in the CARLA simulator.
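The core mechanism described here — one shared hidden state, updated whenever any sensor delivers data and queryable at arbitrary times — can be sketched as follows. This is a minimal illustration, not the paper's architecture: the plain tanh update, the per-modality projections, and all dimensions are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

class RAMCell:
    """Toy Recurrent Asynchronous Multimodal cell: a single hidden state
    shared across sensor modalities, each arriving at its own rate."""

    def __init__(self, hidden_dim, input_dims):
        self.h = np.zeros(hidden_dim)
        # one input projection per modality, e.g. {"events": 8, "frames": 32}
        self.W_in = {name: rng.normal(0, 0.1, (hidden_dim, d))
                     for name, d in input_dims.items()}
        self.W_h = rng.normal(0, 0.1, (hidden_dim, hidden_dim))

    def update(self, sensor, x):
        """Asynchronous update: called whenever `sensor` produces data."""
        self.h = np.tanh(self.W_in[sensor] @ x + self.W_h @ self.h)

    def query(self):
        """A prediction head can read the state at any time."""
        return self.h.copy()
```

In use, the high-rate modality (events) updates the state many times between the sparse frame updates, and a depth decoder would read `query()` whenever a prediction is needed.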
Guillem Torrente, Elia Kaufmann, Philipp Föhn, Davide Scaramuzza, Data-Driven MPC for Quadrotors, IEEE Robotics and Automation Letters, Vol. 6 (2), 2022. (Journal Article)
Sihao Sun, Giovanni Cioffi, Coen de Visser, Davide Scaramuzza, Autonomous Quadrotor Flight Despite Rotor Failure With Onboard Vision Sensors: Frames vs. Events, IEEE Robotics and Automation Letters, Vol. 6 (2), 2022. (Journal Article)
Fault-tolerant control is crucial for safety-critical systems, such as quadrotors. State-of-the-art flight controllers can stabilize and control a quadrotor even when subjected to the complete loss of a rotor. However, these methods rely on external sensors, such as GPS or motion capture systems, for state estimation. To the best of our knowledge, this has not yet been achieved with only onboard sensors. In this letter, we propose the first algorithm that combines fault-tolerant control and onboard vision-based state estimation to achieve position control of a quadrotor subjected to complete failure of one rotor. Experimental validations show that our approach is able to accurately control the position of a quadrotor during a motor failure scenario, without the aid of any external sensors. The primary challenge to vision-based state estimation stems from the inevitable high-speed yaw rotation (over 20 rad/s) of the damaged quadrotor, causing motion blur to cameras, which is detrimental to visual inertial odometry (VIO). We compare two types of visual inputs to the vision-based state estimation algorithm: standard frames and events. Experimental results show the advantage of using an event camera especially in low light environments due to its inherent high dynamic range and high temporal resolution. We believe that our approach will render autonomous quadrotors safer in both GPS-denied and GPS-degraded environments. We release both our controller and VIO algorithm open source.
Alexander Dietsche, Giovanni Cioffi, Javier Hidalgo-Carrio, Davide Scaramuzza, Powerline Tracking with Event Cameras, In: 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), IEEE, 2021. (Conference or Workshop Paper published in Proceedings)
Autonomous inspection of powerlines with quadrotors is challenging. Flights require persistent perception to keep a close look at the lines. We propose a method that uses event cameras to robustly track powerlines. Event cameras are inherently robust to motion blur, have low latency, and high dynamic range. Such properties are advantageous for autonomous inspection of powerlines with drones, where fast motions and challenging illumination conditions are ordinary. Our method identifies lines in the stream of events by detecting planes in the spatio-temporal signal, and tracks them through time. The implementation runs onboard and is capable of detecting multiple distinct lines in real time with rates of up to 320 thousand events per second. The performance is evaluated in real-world flights along a powerline. The tracker is able to persistently track the powerlines, with a mean line lifetime 10× longer than that of existing approaches.
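The geometric insight behind the plane detection is that a line moving across the image sweeps out a plane in (x, y, t) space, so fitting a plane to a cluster of events recovers both the line and its apparent motion. The sketch below shows a least-squares plane fit under stated assumptions; the paper's detector additionally handles clustering and outlier rejection, which are omitted here.

```python
import numpy as np

def fit_event_plane(xs, ys, ts):
    """Fit a plane t = a*x + b*y + c to a cluster of events (xs, ys, ts)
    by linear least squares. The normal direction encodes the line's
    orientation and apparent velocity in the spatio-temporal volume."""
    A = np.column_stack([xs, ys, np.ones_like(xs)])
    coeffs, *_ = np.linalg.lstsq(A, ts, rcond=None)
    return coeffs  # (a, b, c)
```

Tracking then amounts to re-fitting the plane as new events arrive and associating planes across time windows.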
Feifei Xia, Integrated Analyses of Genomic Changes Associated with Resistance to Proteasome Inhibitor Treatments in Multiple Myeloma Patients, University of Zurich, Faculty of Business, Economics and Informatics, 2021. (Master's Thesis)
Multiple myeloma is an incurable but treatable hematological cancer. The main treatment course relies on the administration of proteasome inhibitors. However, resistance to treatment develops over time and most patients relapse. The mechanisms of resistance to proteasome inhibitor treatments (bortezomib and carfilzomib) are still not well understood. To investigate the correlations between genetic and genomic changes and the response to proteasome inhibitor treatments over time in multiple myeloma patients, we assessed gene expression changes, copy number changes, and somatic mutation changes in multiple myeloma patients from the MMRF CoMMpass study. Multiple myeloma samples were collected at diagnosis and during therapy. Most patients were molecularly characterized at diagnosis, and for around 10% of these patients additional samples obtained during therapy are also available. Collected samples were analyzed by whole genome sequencing, whole exome sequencing, and RNA sequencing. Here, we used sequencing data of samples from proteasome inhibitor treated patients. The patients were grouped by their response to proteasome inhibitor treatments after the first-line therapy. Pre-therapy samples, as well as paired pre- and on-therapy samples, were analyzed separately. Through somatic mutation analysis of the pre-therapy samples, we identified 28 significantly mutated genes. In the analyses of paired pre- and on-therapy samples, patients who did not respond to therapy had a higher proportion of increased mutation load and more copy number gains or losses compared to responding patients. Gene expression analyses revealed strong upregulation of a dozen immunoglobulin genes in the on-therapy samples. Integrative genomic analyses of pre- and on-therapy samples from multiple myeloma patients and multiple myeloma cell lines in this project provide insight into the heterogeneous mechanisms of proteasome inhibitor resistance.
Yunlong Song, Mats Steinweg, Elia Kaufmann, Davide Scaramuzza, Autonomous Drone Racing with Deep Reinforcement Learning, In: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), IEEE, 2021. (Conference or Workshop Paper published in Proceedings)
Thomas Huber, Towards Real-Time Optimization-Based Visual Inertial Odometry, University of Zurich, Faculty of Business, Economics and Informatics, 2021. (Master's Thesis)
Philipp Foehn, Angel Romero, Davide Scaramuzza, Time-optimal planning for quadrotor waypoint flight, Science Robotics, Vol. 6 (56), 2021. (Journal Article)
Quadrotors are among the most agile flying robots. However, planning time-optimal trajectories at the actuation limit through multiple waypoints remains an open problem. This is crucial for applications such as inspection, delivery, search and rescue, and drone racing. Early works used polynomial trajectory formulations, which do not exploit the full actuator potential because of their inherent smoothness. Recent works resorted to numerical optimization but require waypoints to be allocated as costs or constraints at specific discrete times. However, this time allocation is a priori unknown and renders previous works incapable of producing truly time-optimal trajectories. To generate truly time-optimal trajectories, we propose a solution to the time allocation problem while exploiting the quadrotor's full actuator potential. We achieve this by introducing a formulation of progress along the trajectory, which enables the simultaneous optimization of the time allocation and the trajectory itself. We compare our method against related approaches and validate it in real-world flights in one of the world's largest motion-capture systems, where we outperform human expert drone pilots in a drone-racing task.
Leonard Bauersfeld, Elia Kaufmann, Philipp Foehn, Sihao Sun, Davide Scaramuzza, NeuroBEM: Hybrid Aerodynamic Quadrotor Model, In: Robotics: Science and Systems (RSS), Online, 2021. (Conference or Workshop Paper published in Proceedings)