Progress Report
Self-Evolving AI Robot System for Lunar Exploration and Human Outpost Construction
[1] Realization of Modular Multi-Agent Robot System
Progress until FY2024
1. Outline of the project
Because transport volume and mass to space are strictly limited, robotic systems delivered from Earth must be highly adaptable and able to reconfigure themselves to suit varying tasks and environments. Coordination among multiple robots with different forms is also essential.
This project develops a reconfigurable modular robot system capable of operating in harsh lunar conditions, such as vacuum, low gravity, and regolith terrain, aiming to achieve decentralized cooperation among heterogeneous robots.
2. Outcome so far
In FY2024, research and development efforts were focused on the following two key themes:

Theme 1: Design, Fabrication, and Functional Analysis of Modular Robots
We conducted the design, prototyping, and functional evaluation of modular robots, which form the core of this project and are capable of dynamically reconfiguring to perform various tasks. Through the step-by-step development of ground-based test models, the following achievements were made.
- 1. Repository for Structure and Control
  - A reusable knowledge base was established by compiling a database of module configurations, forms, functions, and corresponding control methods.
- 2. Design and Fabrication of Connection Mechanisms
  - Connection interfaces were designed to allow easy reconfiguration of modules, and ground test models incorporating these interfaces were fabricated.
- 3. Development of Reconfiguration Algorithms
  - Algorithms were developed and implemented to enable recognition of reconfigured structures and appropriate task allocation based on the module arrangement.
- 4. Development of Plug-and-Play Mechanisms
  - A flexible system was constructed to dynamically detect and manage the connection and disconnection of modules.
- 5. Construction of a Decentralized Cooperative Control System
  - A control architecture was designed and partially implemented to enable cooperative behavior among heterogeneous modular robots.
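
As an illustration of how items 3 and 4 could fit together, the following minimal Python sketch tracks module connection and disconnection events and allocates tasks according to the capabilities of the current module arrangement. It is a sketch under stated assumptions, not the project's implementation: the `Module`, `Assembly`, and `allocate` names, the capability labels, and the greedy allocation rule are all hypothetical and were chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Module:
    """Hypothetical module descriptor; IDs and capability labels are illustrative."""
    module_id: str
    kind: str                 # e.g. "arm", "gripper", "wheel"
    capabilities: frozenset   # what this module contributes, e.g. {"reach"}


@dataclass
class Assembly:
    """Tracks which modules are currently attached to one robot."""
    modules: dict = field(default_factory=dict)

    def connect(self, module: Module) -> None:
        # Plug-and-play: register the module as soon as it is detected.
        self.modules[module.module_id] = module

    def disconnect(self, module_id: str) -> None:
        # Unknown IDs are ignored, so a repeated disconnect event is harmless.
        self.modules.pop(module_id, None)

    def capabilities(self) -> frozenset:
        # The assembly offers the union of its modules' capabilities.
        caps = set()
        for m in self.modules.values():
            caps |= m.capabilities
        return frozenset(caps)


def allocate(tasks: dict, assemblies: dict) -> dict:
    """Greedy allocation (an assumed rule): each task goes to the first free
    assembly whose capabilities cover the task's requirements.
    Tasks that no free assembly can perform map to None."""
    assignment = {}
    free = list(assemblies)  # assembly names, in insertion order
    for task, required in tasks.items():
        chosen = next(
            (name for name in free if required <= assemblies[name].capabilities()),
            None,
        )
        assignment[task] = chosen
        if chosen is not None:
            free.remove(chosen)
    return assignment
```

In this sketch, detaching a gripper module from an assembly removes the "grasp" capability, so a later call to `allocate` no longer assigns grasping tasks to that assembly. A real system would replace the greedy rule with the project's own allocation method.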

Theme 2: Development of Distributed AI Using Hierarchical Reinforcement Learning
Using MoonBot model files, we built a simulation environment for an integrated robot with arm, hand, and mobility modules. We developed and applied a hierarchical reinforcement learning system for lunar manipulation tasks, consisting of the following components:
- Lower-Layer Modules (Low-Layer Learning):
  - Arm Module: Learned reaching motions to bring the end-effector to target positions.
  - Gripper Module: Learned grasping behaviors to securely hold objects.
  - Rover Module: Learned locomotion strategies to move the robot to designated locations.
- Upper-Layer Modules (High-Layer Learning):
  - Learned integrated policies that coordinate lower-level modules to perform composite tasks, including navigating to the worksite and adjusting the end-effector to optimal angles and positions.
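
The two-layer structure above can be sketched in Python as an upper-layer policy that chooses which lower-layer skill to run at each step. This is only a structural illustration under strong assumptions: the learned policies are replaced by hand-written stand-ins, the world is a one-dimensional toy rather than the MoonBot simulation, and every name and constant (`State`, `WORKSITE_X`, `TARGET_X`, the step limits) is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class State:
    rover_x: float = 0.0          # rover position along a 1-D track (m)
    arm_x: float = 0.0            # end-effector offset from the rover (m)
    gripper_closed: bool = False


WORKSITE_X = 2.0   # assumed position where the rover should stop
TARGET_X = 2.3     # assumed position of the object to grasp
TOL = 1e-6         # tolerance for "position reached"


def rover_skill(s: State) -> None:
    # Stand-in for the learned locomotion policy: bounded step toward the worksite.
    s.rover_x += max(-0.5, min(0.5, WORKSITE_X - s.rover_x))


def arm_skill(s: State) -> None:
    # Stand-in for the learned reaching policy: bounded step toward the target.
    error = TARGET_X - (s.rover_x + s.arm_x)
    s.arm_x += max(-0.1, min(0.1, error))


def gripper_skill(s: State) -> None:
    # Stand-in for the learned grasping policy: close the gripper.
    s.gripper_closed = True


SKILLS = {"rover": rover_skill, "arm": arm_skill, "gripper": gripper_skill}


def high_level_policy(s: State) -> str:
    # Stand-in for the learned upper-layer policy: select the next skill.
    if abs(WORKSITE_X - s.rover_x) > TOL:
        return "rover"
    if abs(TARGET_X - (s.rover_x + s.arm_x)) > TOL:
        return "arm"
    return "gripper"


def run_episode(max_steps: int = 50):
    """Run until the object is grasped or the step budget is spent.
    Returns the final state and the sequence of skills that were selected."""
    s = State()
    trace = []
    for _ in range(max_steps):
        if s.gripper_closed:
            break
        name = high_level_policy(s)
        SKILLS[name](s)
        trace.append(name)
    return s, trace
```

Calling `run_episode()` returns the final state together with the order in which skills were selected. In a learned system, each function in `SKILLS` and `high_level_policy` itself would be a trained policy rather than a fixed rule; the sketch only shows how the upper layer delegates to the lower-layer modules.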


Through this hierarchical structure, the system was able to efficiently acquire complex skills such as grasping, locomotion, and manipulation. As a result, it successfully completed the target tasks, including reaching designated positions and executing manipulation actions. In contrast, conventional reinforcement learning without a hierarchical structure failed to sufficiently learn and complete the same tasks.
These results show that the proposed AI system, which combines a modular architecture with hierarchical reinforcement learning, is effective and efficient for acquiring task policies for complex operations involving both locomotion and manipulation.
3. Future plans
To achieve the Moonshot goal of autonomous lunar infrastructure construction using AI robot teams, we will further advance research on flexible robot reconfiguration, optimal task allocation, and hierarchical learning-based skill acquisition and utilization.