Session: 17-01-01: Research Posters
Paper Number: 150609
150609 - Decentralized Near-Optimal Control of Multiple Drones Using SNAC
Drone swarms have many potential applications, such as defense, homeland security, search and rescue, and disaster management. The drones' small size and expendability make them ideal for missions in hazardous environments without risking human life. This research focuses on decentralized optimal control of multiple drones following a designated flight path. The flight path is known only to the leader drone. The remaining drones follow the flight path indirectly, using the positions of the other drones.
Optimal control finds the control signals by minimizing a cost function. Finding the exact optimal control is challenging because it requires solving the underlying Hamilton-Jacobi-Bellman (HJB) equation, which provides the necessary and sufficient conditions for optimality. Instead of an exact solution, we find a near-optimal control solution using the Single Network Adaptive Critic (SNAC). SNAC uses Neural Networks (NN) to approximate the optimal costates, i.e., the gradient of the value function, that solve the HJB equation. SNAC then tunes the parameters of the NN using iterative schemes from Reinforcement Learning (RL).
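The costate-learning idea can be illustrated on a toy problem. The following is a minimal sketch, not the drone controller from this work: it assumes a scalar discrete-time linear system x_{k+1} = a*x_k + b*u_k with cost 0.5 * sum(q*x^2 + r*u^2), and it replaces the neural network with a single weight w so that lambda_{k+1} is approximated by w * x_k. The function names, the parameter values, and the damped least-squares update are all illustrative assumptions. For this toy case the learned weight can be checked against the value obtained from the scalar Riccati equation.

```python
def riccati_costate_gain(a, b, q, r, iters=1000):
    """Reference answer for the toy problem (assumed linear-quadratic setup).

    Iterates the scalar discrete Riccati equation to P, then returns the
    gain mapping x_k to the optimal costate lambda_{k+1}.
    """
    p = q
    for _ in range(iters):
        p = q + a * a * p * r / (r + b * b * p)
    return p * a * r / (r + b * b * p)


def train_snac(a, b, q, r, samples, lr=0.5, epochs=200):
    """SNAC-style iteration with a one-weight critic: lambda_{k+1} ~= w * x_k."""
    w = 0.0
    for _ in range(epochs):
        num = 0.0
        den = 0.0
        for x_k in samples:
            lam_next = w * x_k                    # critic output lambda_{k+1}
            u_k = -b * lam_next / r               # optimal control equation
            x_next = a * x_k + b * u_k            # propagate the state one step
            lam_next2 = w * x_next                # critic at next state: lambda_{k+2}
            target = q * x_next + a * lam_next2   # costate equation gives target lambda_{k+1}
            num += target * x_k
            den += x_k * x_k
        w_fit = num / den                         # least-squares fit of targets to w * x_k
        w += lr * (w_fit - w)                     # damped update toward the fitted weight
    return w


if __name__ == "__main__":
    a, b, q, r = 0.9, 0.1, 1.0, 1.0               # illustrative values only
    samples = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]   # training states (zero excluded)
    print("learned weight:", train_snac(a, b, q, r, samples))
    print("Riccati weight:", riccati_costate_gain(a, b, q, r))
```

In the nonlinear drone problem no closed-form reference exists, which is why a function approximator and an iterative tuning scheme are needed; the single-weight critic here only shows the structure of one training sweep.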
There are two levels of control: position control and attitude control. The position controller forms the outer loop, and the attitude controller forms the inner loop. SNAC was used to learn the optimal behavior for each of these controllers. The position controller receives the linear position and velocity errors and generates the velocities required to reach the desired position. The desired velocities are then sent to a system solver, along with a selected yaw angle, to find the desired angular positions and the lift force, which are then passed to the attitude controller to generate the torques required to operate the vehicle. This control structure is the same for every drone; only the input reference position differs. The reference position for the leader drone is the position of the reference signal at that timestep, and the reference position for each of the other drones is the position of another drone.
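The reference assignment described above can be sketched in a few lines. This is a hedged illustration only: the abstract does not state which drone each follower tracks, so the chain topology used here (drone i tracks drone i-1, with drone 0 as leader) is an assumption, as are the names `assign_references` and `position_errors`.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def assign_references(signal_position: Vec3,
                      drone_positions: Sequence[Vec3]) -> List[Vec3]:
    """Return the reference position tracked by each drone's position controller.

    Drone 0 is the leader and tracks the reference signal. Each follower is
    assumed (hypothetically) to track the drone with the previous index.
    """
    refs: List[Vec3] = []
    for i in range(len(drone_positions)):
        if i == 0:
            refs.append(signal_position)          # leader sees the flight path
        else:
            refs.append(drone_positions[i - 1])   # follower sees only another drone
    return refs


def position_errors(refs: Sequence[Vec3],
                    drone_positions: Sequence[Vec3]) -> List[Vec3]:
    """Per-drone position error (reference minus current position)."""
    return [
        (r[0] - p[0], r[1] - p[1], r[2] - p[2])
        for r, p in zip(refs, drone_positions)
    ]


if __name__ == "__main__":
    # Four drones at rest at the origin; the signal is at an illustrative point.
    positions = [(0.0, 0.0, 0.0)] * 4
    refs = assign_references((1.0, 0.0, 2.0), positions)
    print(position_errors(refs, positions))
```

Because every drone runs the same controller, only the reference passed in changes per drone. In practice a follower would likely track an offset from its neighbor rather than the neighbor's exact position; that detail is not given in the abstract and is omitted here.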
The control method showed promising results in simulation: four drones, starting from rest, were able to synchronize with a generated reference signal that was propagated through time. Although only the leader drone had knowledge of the signal, the other drones were able to follow the trajectory smoothly by using the other drones' states.
The next phase of this work is to further refine the control method. This will be followed by implementing and testing the method on physical drones to replicate the simulation results.
Presenting Author: Haniel Youlesivanson, California State University Northridge
Presenting Author Biography: Haniel is a graduate student in the Department of Mechanical Engineering at California State University Northridge. His research interests are optimal control, reinforcement learning, and multiagent systems.
Authors:
Haniel Youlesivanson, California State University Northridge
Tohid Sardarmehni, California State University Northridge
Paper Type
Poster Presentation