Xiaohei's Blog

Before We Begin#

Since I’ve had some free time recently, I revisited the papers and technical notes I collected when I first started doing UAV research. I reorganized them systematically and also built my own mind map.

When reading papers and running open-source code, notes are often very fragmented. This time, I tried to take a higher-level view and reconnect them around method paradigms, classic tasks, platform design, and the underlying tech stack, as a quick-reference handbook for research and engineering.

Data-driven vs Model-based#

In the UAV field, data-driven methods and model-based (modular) methods each have strengths in different tasks, and neither has displaced the other. The choice is not either-or: each suits a different kind of task and a different level of difficulty.

Why are traditional modular approaches still so strong? Mainly because UAV dynamics models (especially for quadrotors) are not only relatively simple in engineering practice, but also easy to calibrate in the real world. In addition, most UAV tasks are about “moving through the environment” rather than “creating strong physical interaction”. With mature state machines, trajectory planning, and low-level control optimization, you can already achieve excellent flight performance. Most commercial drones we see day to day are still built on this foundation.

Where does end-to-end learning win? Traditional pipelines rely on very accurate state estimation and perception modeling. But on small platforms, constrained by compute, payload, and sensor noise, the whole pipeline is often pushed to its limits. In such cases, learning-based methods (especially using reinforcement learning for perception-driven agile flight) can bypass cumbersome explicit mapping and state derivation, showing reaction speed and robustness beyond the traditional route.

Simulators for UAV RL#

For deep learning/reinforcement learning, a good simulator is essential. In the UAV domain, a few tools show up again and again:

  • AirSim: Based on Unreal Engine (UE4 officially; UE5 through community forks). Great visuals and realistic dynamics. However, making low-level changes has a relatively high barrier to entry, and the runtime frame rate is on the low side for large-scale RL training.
  • Flightmare: Its main selling point is speed, which makes it well suited to RL tasks that require massive data sampling.
  • Aerial Gym: A GPU-parallel simulator built on NVIDIA Isaac Gym and tailored to reinforcement learning, especially popular in Sim2Real (simulation-to-reality transfer) research.

Classic Skills and Representative Works#

This section mainly introduces data-driven methods for classic tasks. Notably, some of the works below reduce or even eliminate reliance on SLAM systems and odometry. Since the initial rise of UAV autonomy benefited greatly from the increasing maturity of SLAM/odometry systems, this could become an interesting direction for UAV skill learning.

Obstacle Avoidance in Unknown Environments and Agile Flight#

How can a UAV weave through forests full of unknown obstacles, rubble, or narrow corridors? This is a highly representative challenge. From early days to now, many clever approaches have been proposed.

Inspired by autonomous driving, a CMU team used supervised learning in an ICRA 2013 paper to map monocular images directly to discrete control commands. Later, UC Berkeley’s CAD2RL trained entirely in simulation on monocular RGB images with domain randomization, and then flew successfully in a real corridor.
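
To make the domain randomization idea concrete, here is a minimal sketch of how per-episode simulator parameters might be sampled. The `SimParams` fields, their ranges, and the texture bank size are all hypothetical choices of mine for illustration; they are not taken from CAD2RL.

```python
import random
from dataclasses import dataclass


@dataclass
class SimParams:
    """Hypothetical rendering/geometry knobs randomized per episode."""
    light_intensity: float    # relative scene brightness
    wall_texture_id: int      # index into an assumed texture bank
    camera_fov_deg: float     # horizontal field of view in degrees
    corridor_width_m: float   # hallway width in meters


def sample_sim_params(rng: random.Random, n_textures: int = 100) -> SimParams:
    """Draw one randomized configuration (ranges are illustrative only)."""
    return SimParams(
        light_intensity=rng.uniform(0.3, 1.5),
        wall_texture_id=rng.randrange(n_textures),
        camera_fov_deg=rng.uniform(70.0, 110.0),
        corridor_width_m=rng.uniform(1.5, 3.0),
    )


rng = random.Random(0)
# One fresh configuration per training episode, so the policy never sees
# the same "look" often enough to overfit to it.
episodes = [sample_sim_params(rng) for _ in range(1000)]
```

The point of the design is that the real corridor should look like just one more sample from this distribution by the time the policy is deployed.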

Then, work from the University of Zurich (UZH) pushed this direction to a new peak:

  • DroNet: Cleverly leverages autonomous-driving datasets to train a network that predicts a steering angle and a collision probability, which are then converted into velocity commands for the UAV.
  • Agile Autonomy: Published in Science Robotics. The core idea is to use the DAgger algorithm to learn from expert data generated by a traditional trajectory planner, arguing that the extremely low latency of end-to-end networks can greatly raise the flight-speed limit in unknown environments.
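
The DAgger loop mentioned in the list above can be sketched on a toy problem. This is only an illustration under heavy assumptions: the state is a 1-D lateral offset, the "expert" is a hand-written proportional rule standing in for a trajectory planner, and the learner is a single linear gain instead of a neural network. None of it is the Agile Autonomy implementation.

```python
import random


def expert_action(x: float) -> float:
    """Stand-in expert: steer back toward the centerline."""
    return -0.8 * x


def fit_linear(data):
    """Least-squares fit of a = w * x through the origin."""
    num = sum(x * a for x, a in data)
    den = sum(x * x for x, _ in data)
    return num / den if den > 0 else 0.0


def rollout(w, rng, steps=30):
    """Fly the *learner* policy and return the states it actually visits."""
    x = rng.uniform(-2.0, 2.0)
    visited = []
    for _ in range(steps):
        visited.append(x)
        x = x + w * x + rng.gauss(0.0, 0.05)  # learner action plus disturbance
    return visited


def dagger(iterations=5, seed=0):
    rng = random.Random(seed)
    dataset = []
    w = 0.0  # untrained learner
    for _ in range(iterations):
        states = rollout(w, rng)                              # 1. run the learner
        dataset += [(x, expert_action(x)) for x in states]    # 2. expert relabels those states
        w = fit_linear(dataset)                               # 3. retrain on the aggregate
    return w


w_learned = dagger()
```

The key difference from plain behavior cloning is step 1: the data comes from states the learner itself reaches, so the expert's labels cover the learner's own mistakes.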

Universities in China have also produced very impressive work in this direction. For example, a Shanghai Jiao Tong University team (Back to Newton’s Laws: Learning Vision-based Agile Flight via Differentiable Physics) proposed using a differentiable physics model to provide first-order gradients for policy optimization, removing the dependence on explicit position/velocity estimation. The paper uses low-resolution depth images; for obstacle avoidance it trains more efficiently than RL and achieves high-speed flight.
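
To show what "first-order gradients through a differentiable physics model" means in the simplest possible setting, here is a toy sketch. Everything in it is my own assumption for illustration: a 1-D double integrator instead of a quadrotor, a two-gain linear policy instead of a depth-image network, and hand-written forward sensitivities instead of an autodiff framework. It is not the paper's method, only the same principle at small scale.

```python
def rollout_with_grad(kp, kv, p0=1.0, v0=0.0, dt=0.05, steps=60):
    """Simulate a 1-D double integrator under the policy a = -kp*p - kv*v.

    Returns the loss sum(p^2)*dt and its exact gradient w.r.t. (kp, kv),
    obtained by differentiating every simulation step.
    """
    p, v = p0, v0
    dp, dv = [0.0, 0.0], [0.0, 0.0]   # sensitivities of p and v w.r.t. (kp, kv)
    loss, grad = 0.0, [0.0, 0.0]
    for _ in range(steps):
        a = -kp * p - kv * v
        # Derivative of the action w.r.t. (kp, kv), including the indirect
        # dependence through the current state.
        da = [-p - kp * dp[0] - kv * dv[0],
              -v - kp * dp[1] - kv * dv[1]]
        # Explicit Euler step for the state and for its sensitivities.
        p, dp = p + v * dt, [dp[i] + dv[i] * dt for i in range(2)]
        v, dv = v + a * dt, [dv[i] + da[i] * dt for i in range(2)]
        loss += p * p * dt
        grad = [grad[i] + 2.0 * p * dp[i] * dt for i in range(2)]
    return loss, grad


def train(kp=1.0, kv=0.5, lr=0.05, iters=40):
    """Plain gradient descent on the policy gains using the physics gradient."""
    history = []
    for _ in range(iters):
        loss, (g_kp, g_kv) = rollout_with_grad(kp, kv)
        history.append(loss)
        kp, kv = kp - lr * g_kp, kv - lr * g_kv
    return kp, kv, history


kp_opt, kv_opt, losses = train()
```

Unlike a policy-gradient RL estimate, the gradient here is exact for the simulated rollout and needs no sampling of random actions, which is one intuition for the training-efficiency claim.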

Similarly, Zhejiang University’s FAST Lab combined reinforcement learning with onboard LiDAR: in their recent Flying on Point Clouds with Reinforcement Learning, they use onboard LiDAR and sim2real RL to achieve aggressive autonomous obstacle avoidance.

Although learning-based methods are advancing rapidly, real engineering projects still rely mostly on traditional trajectory planners such as Ego-Planner. The reason is simple: they are reliable enough in most scenarios and straightforward to debug, while the data closed loop and verification cost of end-to-end approaches remain a significant hurdle.

Representative Works for Other Classic Tasks#

  1. UAV target recognition and pursuit
  2. Autonomous exploration without a prior map
  3. UAV racing and high-maneuver / aerobatic flight

Novel UAV Configuration Design#

Beyond making algorithms smarter, many people are also trying to combine UAVs with manipulators or give them morphing capabilities. With these hardware innovations, the task boundaries of UAVs are significantly expanded.

Aerial Manipulator#

An aerial manipulator (also called an aerial manipulation UAV) combines a UAV’s fast spatial mobility with a manipulator’s precise manipulation ability, making it an ideal carrier for embodied intelligence: it can fly, and it can also grasp and manipulate objects.

Fully-Actuated UAV#

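Common quadrotor UAVs are underactuated: four rotor thrusts must control six degrees of freedom, so position and attitude are coupled (the vehicle has to tilt in order to move sideways). A fully-actuated UAV, which can control position and attitude independently, is theoretically better suited as a flight platform for aerial manipulation.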
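
One way to see the difference is the rank of the matrix that maps rotor thrusts to the 6-D body wrench (three forces, three torques). The sketch below is an idealized model with assumed numbers (arm length, drag-to-thrust ratio `kappa`, a 20° alternating tilt about each arm); `allocation_matrix` and `matrix_rank` are helper names I made up for the example.

```python
import math


def allocation_matrix(n_rotors, tilt_deg, arm=0.25, kappa=0.02):
    """6 x n matrix mapping rotor thrusts to [Fx, Fy, Fz, Tx, Ty, Tz]."""
    alpha = math.radians(tilt_deg)
    columns = []
    for i in range(n_rotors):
        psi = 2.0 * math.pi * i / n_rotors
        sign = 1.0 if i % 2 == 0 else -1.0   # alternate tilt and spin direction
        a = sign * alpha
        # Thrust axis: body z tilted by `a` about the arm direction.
        n = (math.sin(psi) * math.sin(a), -math.cos(psi) * math.sin(a), math.cos(a))
        r = (arm * math.cos(psi), arm * math.sin(psi), 0.0)
        # Torque = r x n (thrust moment) + rotor drag torque along the thrust axis.
        torque = (r[1] * n[2] - r[2] * n[1] + sign * kappa * n[0],
                  r[2] * n[0] - r[0] * n[2] + sign * kappa * n[1],
                  r[0] * n[1] - r[1] * n[0] + sign * kappa * n[2])
        columns.append(n + torque)
    return [[col[row] for col in columns] for row in range(6)]


def matrix_rank(rows, tol=1e-9):
    """Rank via Gaussian elimination with partial pivoting."""
    m = [list(r) for r in rows]
    rank = 0
    for col in range(len(m[0])):
        if rank == len(m):
            break
        pivot = max(range(rank, len(m)), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < tol:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [x - factor * y for x, y in zip(m[r], m[rank])]
        rank += 1
    return rank


quad_rank = matrix_rank(allocation_matrix(4, tilt_deg=0.0))       # flat quadrotor
flat_hexa_rank = matrix_rank(allocation_matrix(6, tilt_deg=0.0))  # flat hexarotor
tilted_hexa_rank = matrix_rank(allocation_matrix(6, tilt_deg=20.0))
print(quad_rank, flat_hexa_rank, tilted_hexa_rank)  # prints "4 4 6"
```

A rank of 4 means lateral forces cannot be commanded without rotating the body; adding rotors alone does not help, while tilting them does.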

Deformable UAV#

Multi-Modal UAV#

This direction covers configuration design, motion control, and autonomous navigation for multi-modal UAVs, which can operate across multiple domains such as air, ground, and underwater. This not only addresses endurance limitations but also expands application potential.

Key Technical Solutions#

For any UAV, how fast and how stably it can fly is largely determined by its state estimation (localization) system. This inevitably leads to the two most familiar technologies: odometry and Simultaneous Localization and Mapping (SLAM).

Odometry provides real-time localization for robots. It is often implemented with an Extended Kalman Filter (EKF) that fuses observations from the IMU, cameras, LiDAR, encoders, millimeter-wave radar, optical flow sensors, and other pose-related sensors to estimate the robot pose at high frequency.
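
As a minimal sketch of the predict/update structure behind such a filter, the example below fuses a 100 Hz IMU acceleration with a 10 Hz position fix in one dimension. Because this toy model is linear, the EKF reduces to an ordinary Kalman filter; the rates, noise levels, and function names are assumptions chosen for illustration, and a real UAV filter would carry full 3-D pose, IMU biases, and nonlinear measurement models.

```python
import math
import random


def kf_predict(x, P, accel, dt, q_vel):
    """Propagate [position, velocity] using one IMU acceleration sample."""
    p, v = x
    x = [p + v * dt + 0.5 * accel * dt * dt, v + accel * dt]
    (p00, p01), (p10, p11) = P
    # P <- F P F^T + Q, with F = [[1, dt], [0, 1]] and Q = diag(0, q_vel)
    P = [[p00 + dt * (p01 + p10) + dt * dt * p11, p01 + dt * p11],
         [p10 + dt * p11, p11 + q_vel]]
    return x, P


def kf_update(x, P, z, r):
    """Correct the state with a position measurement z (H = [1, 0])."""
    s = P[0][0] + r                       # innovation covariance
    k0, k1 = P[0][0] / s, P[1][0] / s     # Kalman gain
    y = z - x[0]                          # innovation
    x = [x[0] + k0 * y, x[1] + k1 * y]
    P = [[(1 - k0) * P[0][0], (1 - k0) * P[0][1]],
         [P[1][0] - k1 * P[0][0], P[1][1] - k1 * P[0][1]]]
    return x, P


def simulate(seed=0, dt=0.01, steps=500, sigma_a=0.2, sigma_z=0.1):
    rng = random.Random(seed)
    true_p, true_v = 0.0, 1.0
    x, P = [0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]]   # initial velocity guess is wrong on purpose
    for k in range(steps):
        a_true = math.sin(k * dt)
        true_p += true_v * dt + 0.5 * a_true * dt * dt
        true_v += a_true * dt
        a_meas = a_true + rng.gauss(0.0, sigma_a)           # noisy IMU sample
        x, P = kf_predict(x, P, a_meas, dt, (sigma_a * dt) ** 2)
        if k % 10 == 9:                                     # slower position fix
            z = true_p + rng.gauss(0.0, sigma_z)
            x, P = kf_update(x, P, z, sigma_z ** 2)
    return x, P, (true_p, true_v)


estimate, covariance, truth = simulate()
```

The high-rate sensor drives the prediction and the low-rate sensor bounds the drift, which is the same division of labor used in VIO and LIO pipelines.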

In the Visual-Inertial Odometry (VIO) field, one of the most classic representatives is HKUST’s VINS-Mono / VINS-Fusion project.

For LiDAR-based odometry, the classic starting point is LOAM (originally from CMU). HKU’s series of work then set the benchmark for LiDAR-Inertial Odometry (LIO) with the widely popular FAST-LIO, and extended it to LiDAR-inertial-visual fusion with FAST-LIVO2, pushing real-time mapping and localization efficiency to new heights step by step.

In addition, for long-term flight in large environments, a SLAM system with loop closure is indispensable infrastructure.

SLAM (Simultaneous Localization and Mapping) builds a map while localizing, which makes loop closure detection possible. With loop closure, when a robot revisits a location it can correct part of the accumulated error and improve localization accuracy during long-duration operation. SLAM is mainly implemented in two styles: filter-based and optimization-based. In practice, it is typically divided into a front end and a back end, and SLAM based on different sensors has its own characteristics.
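
To illustrate how a loop closure corrects accumulated error, here is a deliberately tiny 1-D pose-graph sketch: the robot flies out and back, biased odometry makes it "think" it ended up about a meter away, and a single loop-closure edge pulls the chain back. The noise values and function names are assumptions for the example; real back ends solve the same kind of least-squares problem over 3-D poses with dedicated solvers.

```python
import random


def dead_reckon(odometry, x0=0.0):
    """Integrate relative-motion measurements into absolute poses."""
    poses = [x0]
    for u in odometry:
        poses.append(poses[-1] + u)
    return poses


def close_loop(odometry, loop_measurement, sigma_odom, sigma_loop):
    """Least-squares correction of a 1-D pose chain with one loop-closure edge.

    Minimizes sum((d_i - u_i)^2) / sigma_odom^2
            + (sum(d_i) - z)^2 / sigma_loop^2,
    whose closed-form solution shifts every odometry edge by the same amount.
    """
    n = len(odometry)
    residual = sum(odometry) - loop_measurement
    shift = sigma_odom ** 2 * residual / (sigma_loop ** 2 + n * sigma_odom ** 2)
    return [u - shift for u in odometry]


rng = random.Random(0)
true_steps = [1.0] * 10 + [-1.0] * 10     # fly out 10 m, then return to the start
# Odometry with a constant bias (drift) and a little noise.
odometry = [s + 0.05 + rng.gauss(0.0, 0.01) for s in true_steps]

truth = dead_reckon(true_steps)
raw = dead_reckon(odometry)
# Loop closure: the robot recognizes it is back at the start (relative pose 0).
optimized = dead_reckon(close_loop(odometry, 0.0, sigma_odom=0.05, sigma_loop=0.01))

raw_max_err = max(abs(a - b) for a, b in zip(raw, truth))
opt_max_err = max(abs(a - b) for a, b in zip(optimized, truth))
```

Note that the correction improves the whole trajectory, not just the final pose, because the error is redistributed over every edge in the loop.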

Beyond mapping algorithms, some general robot development tools are also must-have staples for UAV R&D:

  • The classic ROS / ROS2 ecosystem, especially timestamp alignment across multiple sensors (e.g., message_filters’s TimeSynchronizer).
  • In highly dynamic scenarios such as aerial manipulators, people also often use planning and kinematics libraries such as NVIDIA’s cuRobo (CUDA-accelerated collision checking and planning), IKFast, or mplib in the ManiSkill ecosystem.
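
On the timestamp alignment point in the list above: the sketch below shows the underlying idea in plain Python, pairing each camera stamp with the nearest LiDAR stamp inside a tolerance window. It is not the message_filters implementation; `match_nearest` and the sample stamps are made up for illustration. In ROS, when stamps from different sensors never match exactly, message_filters’ ApproximateTimeSynchronizer with its `slop` parameter plays this role.

```python
from bisect import bisect_left


def match_nearest(stamps_a, stamps_b, slop):
    """Pair each stamp in A with the nearest stamp in sorted B, if within `slop` seconds.

    Greedy nearest-neighbor matching: it does not enforce one-to-one pairing.
    """
    pairs = []
    for ta in stamps_a:
        i = bisect_left(stamps_b, ta)
        # Only the neighbors just before and just after `ta` can be nearest.
        candidates = stamps_b[max(i - 1, 0): i + 1]
        if not candidates:
            continue
        tb = min(candidates, key=lambda t: abs(t - ta))
        if abs(tb - ta) <= slop:
            pairs.append((ta, tb))
    return pairs


camera = [0.000, 0.033, 0.066, 0.100]   # roughly 30 Hz
lidar = [0.005, 0.105, 0.205]           # 10 Hz
pairs = match_nearest(camera, lidar, slop=0.01)
print(pairs)  # prints "[(0.0, 0.005), (0.1, 0.105)]"
```

Choosing the tolerance is a trade-off: too tight and most frames are dropped, too loose and fast motion between the two stamps shows up as estimation error.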

Outlook#

After organizing all of this, my core takeaway is that Sim2Real (simulation-to-reality transfer) may be one of the easiest routes for individual developers or small teams to get started and obtain tangible results at this stage. More broadly, UAVs also need to make the transition from physical intelligence to embodied intelligence. I will keep digging into this topic and will summarize it in a separate blog post.

UAV Review
https://xiaohei-blog.vercel.app/en/blog/uav-review
Author 红鼻子小黑
Published at June 27, 2025