Kaizhe Hu, Haochen Shi, and their research team have introduced the "Robot-Trains-Robot" (RTR) framework, a novel approach designed to overcome significant challenges in real-world humanoid robot learning. This new system, detailed in a paper presented at CoRL 2025, enables humanoid robots to acquire complex motor skills with substantially reduced human oversight, drawing inspiration from how humans learn through hands-on guidance.
Traditional simulation-based reinforcement learning has advanced humanoid locomotion, but transferring these skills to the physical world often faces a "sim-to-real gap," alongside safety concerns, complex reward design, and efficiency issues. The RTR framework directly addresses these limitations by proposing a system where a robotic arm acts as a teacher, actively supporting and guiding a humanoid robot student. As Kaizhe Hu stated in a tweet:

> "How do we learn motor skills directly in the real world? Think about learning to ride a bike—parents might be there to give you hands-on guidance. Can we apply this same idea to robots?"
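In code, that kind of hands-on guidance can be pictured as a supervisory loop run by the teacher arm around the student's episodes. The sketch below is purely illustrative: `TeacherArm`, `Humanoid`, and their methods are hypothetical stand-ins, since the article does not describe the real system's interfaces or control details.

```python
import random

class TeacherArm:
    """Hypothetical stand-in for the supporting robot arm (the 'teacher')."""
    def set_support_force(self, newtons):
        pass                                   # partial weight support / protection

    def apply_perturbation(self, magnitude):
        pass                                   # small controlled push to harden the policy

    def detect_failure(self, obs):
        return obs["tilt_deg"] > 30.0          # e.g., excessive tilt or tether force

    def reset_humanoid(self):
        pass                                   # lift the student back to its start pose


class Humanoid:
    """Hypothetical stand-in for the humanoid student."""
    def observe(self):
        return {"tilt_deg": random.uniform(0.0, 40.0)}   # toy sensor reading

    def act(self, action):
        pass                                   # send joint commands


def training_session(arm, robot, policy, episodes=100, steps=500):
    """Unattended session: the arm protects, perturbs, detects failure, and resets."""
    support = 30.0                             # newtons of assistance, reduced on a schedule
    for ep in range(episodes):
        arm.reset_humanoid()                   # automatic reset, no human in the loop
        arm.set_support_force(support)
        for _ in range(steps):
            obs = robot.observe()
            robot.act(policy(obs))
            arm.apply_perturbation(magnitude=0.05)
            if arm.detect_failure(obs):        # the arm catches the fall before damage
                break
        support = max(0.0, support - 0.3)      # curriculum: gradually wean off support


# Example: run a session with a placeholder policy.
training_session(TeacherArm(), Humanoid(), policy=lambda obs: None)
```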
The RTR system provides comprehensive support, including protection mechanisms, learning schedules, reward signals, controlled perturbations, failure detection, and automatic resets. This integrated approach allows for "efficient long-term real-world humanoid training with minimal human intervention," according to the researchers. Furthermore, the framework incorporates a novel reinforcement learning pipeline that optimizes a dynamics-encoded latent variable, stabilizing the transfer of policies from simulation to reality.
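That last point can be made concrete with a small sketch. Assuming the simulation-trained policy is conditioned on a low-dimensional latent z that encodes dynamics, one generic way to adapt it on hardware is to freeze the policy weights and search over z using measured episode returns; the cross-entropy method used here is one standard choice, and the paper's actual optimizer and pipeline may differ.

```python
import numpy as np

# Illustrative only: assumes a frozen sim-trained policy pi(action | obs, z)
# and that real_rollout(z) runs one episode on the physical robot and returns
# the total reward for that latent.

LATENT_DIM = 8

def adapt_latent(real_rollout, iterations=10, population=16, elites=4):
    """Search over the dynamics-encoded latent z using only real-world returns."""
    mean, std = np.zeros(LATENT_DIM), np.ones(LATENT_DIM)
    for _ in range(iterations):
        candidates = mean + std * np.random.randn(population, LATENT_DIM)
        returns = np.array([real_rollout(z) for z in candidates])  # one episode per candidate
        elite = candidates[np.argsort(returns)[-elites:]]          # keep the best latents
        mean, std = elite.mean(axis=0), elite.std(axis=0) + 1e-3
    return mean

# Toy stand-in for a physical rollout: the "true" dynamics favor z near 0.5.
toy_rollout = lambda z: -float(np.sum((z - 0.5) ** 2))
print("adapted latent:", np.round(adapt_latent(toy_rollout), 2))
```

The appeal of adapting only a handful of latent dimensions, rather than millions of network weights, is that scarce and expensive real-world episodes go much further per parameter being tuned.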
The efficacy of RTR has been validated through challenging real-world tasks. The framework successfully fine-tuned a walking policy for precise speed tracking and enabled a humanoid to learn a swing-up task from scratch, showcasing the "promising capabilities of real-world humanoid learning realized by RTR-style systems." This advancement, developed by a research team at Stanford University including Kaizhe Hu, Haochen Shi, Yao He, Weizhuo Wang, C. Karen Liu, and Shuran Song, represents a significant step towards more autonomous and capable humanoid robots. As Professor Shuran Song noted, this approach aims to train robots "like a real toddler—guiding and supporting it as it learns, so it can master real-world skills safely and effectively!" This development could accelerate the deployment of humanoids in diverse real-world applications where minimizing human intervention is crucial.