Session: ASME Undergraduate Student Design Expo
Paper Number: 175765
Robust Navigation for Ground Robots Under Partial Observability: A Hybrid PPO-LSTM and LLM Framework for the Husky A200
The goal of this project is to improve autonomous robot navigation in real-world, harsh, and partially observable environments by developing and testing a quantum-enhanced robot learning framework that enables mobile robots to adapt to degraded sensor conditions and dynamic obstacles more effectively than traditional methods. Through autonomous navigation, robots can map their surroundings while moving through real-world environments. Mobile robots can take on tasks that are unsafe for humans, such as traversing disaster zones, quarantine zones, and other planets. These capabilities benefit search and rescue teams, hospital staff, and interplanetary researchers, since the robot can navigate independently while collecting data or transporting materials. Methods such as Adaptive Monte Carlo Localization (AMCL) assume consistent sensor inputs and perform well under ideal conditions. In harsher applications, however, a robot may need to navigate under less ideal conditions, such as limited visibility that leads to partial observability. In laser-based SLAM, for example, mapping errors arise when solid landmarks are scarce: dynamic objects, reflective surfaces, and open areas do not provide reliable landmarks, and the resulting sensor degradation makes navigation less reliable, increasing collisions and mapping errors.
To address this problem, we propose a Deep Reinforcement Learning (DRL) framework that integrates Proximal Policy Optimization (PPO) with Long Short-Term Memory (LSTM) networks and a Large Language Model (LLM) into the navigation stack of the Husky mobile robot, allowing the robot to adapt continuously to degraded laser scans. DRL works through adaptive trial and error, rewarding the system when a task is completed and penalizing it after collisions. The Quantum PPO-LSTM agent (QSNE) combines the two networks: PPO is the reinforcement learning algorithm that makes movement decisions from the incoming sensor data, while the LSTM is the recurrent network that lets the robot retain past information. Together, these components allow the robot to learn from mistakes and improve navigation based on past experience. Sensor data is fed into the QSNE through a Parameterized Quantum Circuit (PQC), which converts raw sensor features into measurement values, using entangling gates to capture correlations between features across separate scans and thereby improving map accuracy. The LLM analyzes summarized laser scans and odometry to produce semantic insights, such as obstacle positions and suggested navigation actions, which guide the PPO-LSTM agent in noisy or sensor-flickering environments.
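As a concrete illustration of this architecture, the sketch below shows one plausible way to wire a PQC feature encoder into a recurrent actor-critic of the kind PPO trains. This is a minimal sketch, not the authors' implementation: the use of PyTorch and PennyLane, the qubit count, the gate layout, the scan-summary dimension, and the discrete action space are all illustrative assumptions.

```python
import pennylane as qml
import torch
import torch.nn as nn

N_QUBITS = 4
dev = qml.device("default.qubit", wires=N_QUBITS)

@qml.qnode(dev, interface="torch")
def pqc(inputs, weights):
    # Angle-encode compressed scan features, then entangle qubits so the
    # circuit can capture correlations between features; the Pauli-Z
    # expectation values are the real-valued measurements fed onward.
    qml.AngleEmbedding(inputs, wires=range(N_QUBITS))
    qml.BasicEntanglerLayers(weights, wires=range(N_QUBITS))
    return [qml.expval(qml.PauliZ(w)) for w in range(N_QUBITS)]

class QuantumRecurrentActorCritic(nn.Module):
    def __init__(self, scan_dim=36, hidden_dim=128, n_actions=5):
        super().__init__()
        self.compress = nn.Linear(scan_dim, N_QUBITS)  # scan summary -> qubit angles
        self.qlayer = qml.qnn.TorchLayer(pqc, {"weights": (2, N_QUBITS)})
        # The LSTM carries memory of past scans, mitigating partial observability.
        self.lstm = nn.LSTM(N_QUBITS, hidden_dim, batch_first=True)
        self.actor = nn.Linear(hidden_dim, n_actions)  # action logits for PPO
        self.critic = nn.Linear(hidden_dim, 1)         # value estimate for PPO

    def forward(self, scans, hidden=None):
        # scans: (batch, time, scan_dim) sequence of downsampled laser scans
        b, t, _ = scans.shape
        q = self.qlayer(torch.tanh(self.compress(scans.reshape(b * t, -1)))).float()
        out, hidden = self.lstm(q.reshape(b, t, N_QUBITS), hidden)
        return self.actor(out), self.critic(out), hidden
```

The recurrent hidden state is the key design element here: it lets the policy act on the history of scans rather than any single one, which is what keeps behavior sensible when an individual scan is noisy or partially dropped.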
Experiments were conducted both in Gazebo simulation and in the real world. The simulated experiments covered four environments: two indoor and two outdoor. Artificial noise and flickering were injected into the laser sensor, each with 50% probability, and navigation was tested with standard AMCL and with our proposed DRL framework. The real-world experiments covered three environments: one indoor and two outdoor. These used the same artificial noise and flickering but added dynamic motion from people or vehicles crossing the robot's path, testing the robot's reactions. In the indoor experiment, the robot achieved an 80% success rate with baseline AMCL navigation and 85% with the DRL framework. The second experiment, conducted in front of a building, improved from a 55% success rate with baseline AMCL to 75% with the DRL framework. The final experiment, in an open parking lot, improved from a 20% success rate with standard AMCL to 65% with the DRL framework. Across every test, the DRL framework also yielded shorter path lengths, shorter times to reach the target, and fewer collisions per trial than baseline navigation. In dynamic environments, the robot using the DRL framework navigated around pedestrians or stopped completely to avoid collision when an obstacle suddenly appeared in its path.
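The sensor-degradation model described above (noise and flickering, each injected with 50% probability per scan) could be sketched as follows. The noise scale, the dropped-beam fraction, and the use of infinite range readings to represent flicker are assumptions for illustration, not details from the paper.

```python
import numpy as np

def degrade_scan(ranges, rng, p=0.5, noise_std=0.1, drop_frac=0.3):
    """Apply noise and/or flicker to one laser scan, each with probability p."""
    ranges = ranges.copy()
    if rng.random() < p:  # noise event: Gaussian perturbation of range readings
        ranges = ranges + rng.normal(0.0, noise_std, size=ranges.shape)
    if rng.random() < p:  # flicker event: a random subset of beams returns nothing
        dropped = rng.random(size=ranges.shape) < drop_frac
        ranges[dropped] = np.inf  # dropped beams read as "no return"
    return ranges

# Example: degrade a hypothetical 360-beam scan
rng = np.random.default_rng(0)
scan = rng.uniform(0.5, 10.0, size=360)
noisy = degrade_scan(scan, rng)
```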
Presenting Author: Caiden Sivak Florida Institute of Technology
Presenting Author Biography: Junior Mechanical Engineering student at Florida Institute of Technology, working toward a career in robotics. Currently works with two professors, in a robotics lab and a laser thermal testing lab. Has experience with traditional mechanical design and modeling through coursework, while expanding knowledge of software/AI and thermal testing through lab research.
Authors:
Caiden Sivak Florida Institute of Technology
Truong Nhut Huynh Florida Institute of Technology
Samantha Bullard Florida Institute of Technology
Kim-Doang Nguyen Florida Institute of Technology
Paper Type
Undergraduate Expo