Session: ASME Undergraduate Student Design Expo
Paper Number: 176048
Autonomous Mobile Manipulation With String-Based Mapping and LLMs
This research aims to develop an autonomous indoor mobile manipulation framework that combines LLM-driven control and a novel string-based mapping approach to overcome partial observability and enable robots to navigate, interact with objects, and perform tasks in dynamic, real-world environments.
Autonomous mobile manipulation enables robots to navigate indoor spaces and manipulate their environment, making these tasks a critical area of research. Partial observability refers to the challenge where a robot lacks complete information about its surroundings due to limitations in sensors, occlusions, or dynamic changes in the environment. Robots operating in indoor environments must navigate and interact with objects to accomplish specific tasks. The overall goal of this work is to develop a mobile manipulation framework that allows a robot to autonomously navigate an indoor environment, avoid obstacles, locate items, perform interactions, and return to the lab area. Our experiments use the Stretch mobile manipulator, which is capable of performing a wide range of tasks: its arm and gripper can extend, raise, and rotate, allowing the robot to pick up and place items, press buttons, and grasp lightweight objects securely. This project advances autonomy in robotics and engineering through work in the Robot Operating System (ROS), LiDAR data processing, and indoor mapping using SLAM (Simultaneous Localization and Mapping) and FUNMAP (Fast Unified Navigation, Mapping, and Planning), while attempting to close the research gap around partial observability.
Our proposed framework makes significant contributions to the field of autonomous robotics by advancing both perception and control mechanisms for mobile manipulation in indoor environments under conditions of partial observability. The methodology integrates a Large Language Model (LLM) as a high-level planner and interpreter of natural language instructions, enabling intuitive human-robot interaction and flexible task execution. The LLM translates unstructured language inputs into structured navigation and manipulation commands, effectively bridging the gap between human intent and robotic action.
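As an illustration of this translation step, the following Python sketch shows one way natural language could be mapped to a structured command. The query_llm helper, the JSON schema, and the action vocabulary are hypothetical stand-ins for illustration, not the actual interface used in this work.

import json

def query_llm(prompt: str) -> str:
    # Hypothetical helper: send the prompt to whatever LLM backend is in
    # use (e.g., an OpenAI-style chat endpoint) and return the raw reply.
    raise NotImplementedError("wire this to your LLM backend")

SYSTEM_PROMPT = (
    "Translate the user's instruction into a JSON object with fields "
    "'action' (one of navigate, pick, place, press), 'target', and an "
    "optional 'destination'. Reply with JSON only."
)

def instruction_to_command(instruction: str) -> dict:
    """Map an unstructured instruction to a structured robot command."""
    reply = query_llm(SYSTEM_PROMPT + "\nInstruction: " + instruction)
    command = json.loads(reply)
    # Reject anything outside the known action vocabulary before execution.
    if command.get("action") not in {"navigate", "pick", "place", "press"}:
        raise ValueError("unsupported action: %r" % command.get("action"))
    return command

# Example: "Bring the red cup to the lab bench" might yield
# {"action": "pick", "target": "red cup", "destination": "lab bench"}

Validating the parsed command against a fixed action vocabulary keeps the LLM's open-ended output within the set of behaviors the robot can actually execute.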
A central innovation of this research lies in the introduction of a string-based spatial representation technique. In this approach, the environment is decomposed into interconnected linear segments, or "strings," which abstractly represent spatial features such as paths, boundaries, and object locations. This representation not only simplifies the robot's internal map but also offers robustness against sensor degradation, flickering, and occlusions by allowing the robot to infer environmental structure even from incomplete or noisy data. These strings update dynamically as the robot moves and senses, forming a flexible, evolving network that enhances situational awareness and adaptability.
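To make the idea concrete, a minimal Python sketch of such a representation is given below. The segment fields, the confidence bookkeeping, and the tolerance-based merge heuristic are illustrative assumptions on our part; the exact data structure is not detailed in this abstract.

from dataclasses import dataclass, field

@dataclass
class StringSegment:
    """One linear 'string': a 2D segment with a semantic label."""
    start: tuple[float, float]
    end: tuple[float, float]
    label: str            # e.g., "wall", "path", "doorway"
    confidence: float = 1.0

@dataclass
class StringMap:
    """Evolving network of interconnected string segments."""
    segments: list[StringSegment] = field(default_factory=list)

    def update(self, observed: StringSegment, tol: float = 0.2) -> None:
        # If a nearby segment with the same label already exists, reinforce
        # it; otherwise add the observation as a new string. This merge step
        # is what lets the map absorb noisy or partial observations.
        for seg in self.segments:
            if seg.label == observed.label and _close(seg, observed, tol):
                seg.confidence = min(1.0, seg.confidence + 0.1)
                return
        self.segments.append(observed)

def _close(a: StringSegment, b: StringSegment, tol: float) -> bool:
    """Crude proximity test: b's endpoints within tol of a's endpoints."""
    d = lambda p, q: ((p[0] - q[0])**2 + (p[1] - q[1])**2) ** 0.5
    return d(a.start, b.start) < tol and d(a.end, b.end) < tol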
By combining this string-based mapping with LLM-driven decision-making and real-time LiDAR and SLAM-based localization (via FUNMAP), the system enables the mobile robot to navigate autonomously, identify and interact with task-relevant objects, and adapt its behavior in response to environmental uncertainty. The transformative potential of this research lies in its ability to bridge classical geometric mapping with semantic understanding and language-based control, laying the groundwork for more generalizable, robust, and human-interactive robotic systems capable of operating reliably in real-world, partially observable indoor environments.
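The sketch below, reusing the StringMap sketched above, illustrates how these components might be glued together in a sense-plan-act loop. The read_scan and planner_step callables stand in for the actual ROS/LiDAR and FUNMAP interfaces, and the line-extraction routine is left as a placeholder; none of these names come from the real system.

from typing import Callable

def extract_strings(points):
    # Placeholder for a line-extraction routine, e.g., split-and-merge or
    # RANSAC over the LiDAR point cloud; returns StringSegment instances.
    raise NotImplementedError

def control_loop(
    read_scan: Callable[[], list],    # returns LiDAR points as (x, y) pairs
    planner_step: Callable[[dict, StringMap], bool],  # FUNMAP-style replanner; True at goal
    string_map: StringMap,            # the evolving string network
    command: dict,                    # output of instruction_to_command()
) -> None:
    """Illustrative sense-plan-act cycle under partial observability."""
    done = False
    while not done:
        points = read_scan()
        # Fold newly observed segments into the evolving string network.
        for seg in extract_strings(points):
            string_map.update(seg)
        # Replan against the current, possibly incomplete, map each step.
        done = planner_step(command, string_map)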
Preliminary results demonstrate robust object detection and segmentation, with an average precision of 92% for common indoor objects and 60% for rare objects such as water fountains, despite limited datasets. Inside the laboratory, the FUNMAP framework achieved autonomous navigation with a 95% success rate in obstacle avoidance and path planning across 20 test runs. Current efforts aim to extend FUNMAP into the hallway to expand the navigation area, with early tests showing successful runs under dynamic hallway conditions. The next step is to transform all observed environmental data, including sensor inputs, into interconnected strings to enhance the FUNMAP framework, and to refine the LLM to handle a broader range of hallway-specific navigation instructions. Extensive hallway tests will validate these advancements to ensure reliable performance in larger, dynamic, real-world indoor settings.
Presenting Author: Samantha Bullard Florida Institute of Technology
Presenting Author Biography: Samantha Bullard is a junior Aerospace Engineering student with a 4.0 GPA whose research focuses on autonomous robotics and artificial intelligence. She is currently contributing to the development of an autonomous mobile manipulation framework that aims to bridge the research gap of partial observability. In addition to her academic and research commitments, she serves as a captain of the varsity swim team at Florida Tech.
Authors:
Samantha Bullard Florida Institute of Technology
Caiden Sivak Florida Institute of Technology
Truong Nhut Huynh Florida Institute of Technology
Kim-Doang Nguyen Florida Institute of Technology
Autonomous Mobile Manipulation With String-Based Mapping and LLMs
Paper Type
Undergraduate Expo