Robotic task planning in real-world environments requires reasoning over implicit constraints from language and vision. While LLMs and VLMs offer strong priors, they struggle with long-horizon structure and symbolic grounding. Existing methods that combine LLMs with symbolic planning often rely on handcrafted or narrow domains, limiting generalization. We propose UniDomain, a framework that pretrains a PDDL domain from robot manipulation demonstrations and applies it to online robotic task planning. It extracts single domains from 12,393 manipulation videos to form an all-domain set with 3,137 operators, 2,875 predicates, and 16,481 causal edges. Given a target class of tasks, it retrieves relevant domains from the all-domain set and systematically fuses them into a high-quality meta-domain to support compositional generalization in planning. Experiments on diverse real-world tasks show that UniDomain solves complex, unseen tasks in a zero-shot manner, achieving up to 58% higher task success and a 160% improvement in plan optimality over state-of-the-art LLM and LLM-PDDL baselines.
Teach robots a reusable planning world from demonstrations.
Pretrain once, then reuse symbolic knowledge for unseen tasks with zero-shot planning.
One reusable symbolic graph built from real robot demonstrations.
On unseen long-horizon tasks, UniDomain beats strong LLM-only and LLM-PDDL baselines.
Users say what they want, and the robot prepares the drink and serves it at the table.
Given a scene image and a natural-language instruction, UniDomain grounds a PDDL problem and solves it to produce an executable plan.
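To make the "ground a problem, then solve it" step concrete, here is a minimal sketch of symbolic planning over STRIPS-style operators, assuming the problem has already been grounded into an initial state and a goal. It is not UniDomain's planner or domain: the `Operator` type, the `plan` function, and the drink-serving facts and operator names are all hypothetical, and plain breadth-first search stands in for a real PDDL solver.

```python
from collections import deque
from typing import NamedTuple, Optional


class Operator(NamedTuple):
    """A ground STRIPS-style operator (hypothetical, for illustration only)."""
    name: str
    pre: frozenset      # facts that must hold before the operator applies
    add: frozenset      # facts made true by the operator
    delete: frozenset   # facts made false by the operator


def plan(init, goal, operators) -> Optional[list]:
    """Breadth-first search over symbolic states.

    Returns a shortest sequence of operator names reaching the goal,
    or None if the goal is unreachable.
    """
    start, goal = frozenset(init), frozenset(goal)
    frontier = deque([(start, [])])
    seen = {start}
    while frontier:
        state, steps = frontier.popleft()
        if goal <= state:  # every goal fact holds in this state
            return steps
        for op in operators:
            if op.pre <= state:  # preconditions satisfied
                nxt = (state - op.delete) | op.add
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, steps + [op.name]))
    return None


# Hypothetical toy domain: facts and operators are invented for this sketch.
OPS = [
    Operator("pick_cup",
             frozenset({"hand_empty", "cup_on_counter"}),
             frozenset({"holding_cup"}),
             frozenset({"hand_empty", "cup_on_counter"})),
    Operator("fill_cup_with_tea",
             frozenset({"holding_cup"}),
             frozenset({"cup_has_tea"}),
             frozenset()),
    Operator("place_cup_on_table",
             frozenset({"holding_cup"}),
             frozenset({"cup_on_table", "hand_empty"}),
             frozenset({"holding_cup"})),
]

INIT = {"hand_empty", "cup_on_counter"}   # would come from the scene image
GOAL = {"cup_on_table", "cup_has_tea"}    # would come from the instruction

result = plan(INIT, GOAL, OPS)
print(result)
```

In this toy setting the initial facts play the role of what is grounded from the scene image and the goal facts play the role of the instruction; the returned operator sequence is the plan a robot would execute. A real system would use lifted PDDL operators and an off-the-shelf planner rather than exhaustive search.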
Across 100 unseen long-horizon tasks in four domains, UniDomain outperforms both direct LLM/VLM planners and hybrid LLM-PDDL baselines on success, plan quality, and efficiency.
UniDomain can be seamlessly integrated into a real robot system. In our drink-making setup, a dual-arm humanoid robot takes spoken requests, reasons over ingredients and preparation steps, and serves the finished drink to the user.
The user can customize what to make from available ingredients such as tea, milk, water, floral teas, and fruit syrups including mango, lychee, and kumquat lemon, rather than choosing from a single fixed recipe.
@inproceedings{ye2025unidomain,
title={UniDomain: Pretraining a Unified {PDDL} Domain from Real-World Demonstrations for Generalizable Robot Task Planning},
author={Haoming Ye and Yunxiao Xiao and Cewu Lu and Panpan Cai},
booktitle={The Thirty-ninth Annual Conference on Neural Information Processing Systems},
year={2025},
}