UniDomain: Pretraining a Unified PDDL Domain from Real-World Demonstrations for Generalizable Robot Task Planning

Abstract

Robotic task planning in real-world environments requires reasoning over implicit constraints from language and vision. While LLMs and VLMs offer strong priors, they struggle with long-horizon structure and symbolic grounding. Existing methods that combine LLMs with symbolic planning often rely on handcrafted or narrow domains, limiting generalization. We propose UniDomain, a framework that pre-trains a PDDL domain from robot manipulation demonstrations and applies it to online robotic task planning. It extracts atomic domains from 12,393 manipulation videos to form a unified domain with 3,137 operators, 2,875 predicates, and 16,481 causal edges. Given a target class of tasks, it retrieves relevant atomic domains from the unified domain and systematically fuses them into a high-quality meta-domain to support compositional generalization in planning. Experiments on diverse real-world tasks show that UniDomain solves complex, unseen tasks in a zero-shot manner, achieving up to 58% higher task success and 160% improvement in plan optimality over state-of-the-art LLM and LLM-PDDL baselines.

Highlights

Teach robots a reusable planning world from demonstrations.

Pretraining from Real-World Demonstrations

12,393 demonstrations

Pretrain once, then reuse symbolic knowledge for unseen tasks with zero-shot planning.

Unified Domain as Planning World

3,137 operators, 2,875 predicates, 16,481 causal edges

One reusable symbolic graph built from real robot demonstrations.

State-of-the-Art Performance

+58% success, +160% optimality

On unseen long-horizon tasks, UniDomain beats strong LLM-only and LLM-PDDL baselines.

Real-World Drink Serving

Dual-arm humanoid robot serving

Users can speak what they want, and the robot prepares and serves the drink at the table.

Inside the Unified Domain

One graph, two kinds of knowledge. In the unified domain, predicate nodes in purple describe world states, while operator nodes in green describe reusable robot actions.
Causality is explicit. The edges capture the semantic relationships used in planning, including how predicates and operators connect through preconditions and effects.
Local clusters reveal reusable patterns. Nearby nodes often reflect predicates and operators that frequently appear together, exposing reusable planning motifs across tasks.
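
The graph structure described above can be illustrated with a minimal sketch. It assumes a bipartite layout in which a predicate points to every operator that requires it as a precondition, and an operator points to every predicate it produces as an effect. The `Operator` class, the `build_causal_graph` helper, and the toy predicates are hypothetical and are not taken from the released unified domain.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operator:
    """A hypothetical, simplified operator: a name plus predicate names."""
    name: str
    preconditions: frozenset
    effects: frozenset


def build_causal_graph(operators):
    """Return (predicate_nodes, operator_nodes, edges).

    Each edge is (source, target, kind):
      predicate -> operator for a precondition,
      operator -> predicate for an effect.
    """
    predicate_nodes, operator_nodes, edges = set(), set(), set()
    for op in operators:
        operator_nodes.add(op.name)
        for pred in op.preconditions:
            predicate_nodes.add(pred)
            edges.add((pred, op.name, "precondition"))
        for pred in op.effects:
            predicate_nodes.add(pred)
            edges.add((op.name, pred, "effect"))
    return predicate_nodes, operator_nodes, edges


# Toy operators, invented for illustration only.
ops = [
    Operator("pick", frozenset({"hand_empty", "on_table"}), frozenset({"holding"})),
    Operator("place_in", frozenset({"holding"}), frozenset({"in_container", "hand_empty"})),
]
preds, op_nodes, edges = build_causal_graph(ops)
print(len(preds), len(op_nodes), len(edges))  # 4 2 6
```

Because `holding` is an effect of `pick` and a precondition of `place_in`, the two operators become connected through a shared predicate node, which is the kind of link a planner follows when chaining actions.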

Visualization of Unified Domain: the pre-trained unified domain, with 3,137 operator nodes (green) and 2,875 predicate nodes (purple).

Unified Domain (10% Sample): A sampled view of the full graph, showing the scale and connectivity learned from demonstrations.

The Vocabulary of the Graph

Broad Coverage. UniDomain spans 170 action categories, from everyday verbs like push and stir to fine-grained behaviors such as scrunch and rub. The compact meta-domain still preserves rich semantic knowledge for efficient planning.

Meta Domain

See Planning in Action

Given a scene image and a natural-language instruction, UniDomain grounds a PDDL problem and solves it into an executable plan.
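
The solving step can be sketched with a toy breadth-first search over ground actions. This is not the planner used by UniDomain, which this page does not specify; a real system would hand the PDDL domain and problem to an off-the-shelf planner. The action names and predicates below are hypothetical and only loosely modeled on a kitchen-style task.

```python
from collections import deque

# Hypothetical ground actions: name -> (preconditions, add effects, delete effects).
ACTIONS = {
    "pick_corn_from_pot": (
        {"corn_in_pot", "hand_empty"},
        {"holding_corn"},
        {"corn_in_pot", "hand_empty"},
    ),
    "place_corn_in_bowl": (
        {"holding_corn"},
        {"corn_in_bowl", "hand_empty"},
        {"holding_corn"},
    ),
    "open_drawer": (
        {"drawer_closed", "hand_empty"},
        {"drawer_open"},
        {"drawer_closed"},
    ),
    "pick_towel_from_drawer": (
        {"drawer_open", "towel_in_drawer", "hand_empty"},
        {"holding_towel"},
        {"towel_in_drawer", "hand_empty"},
    ),
}


def plan(init, goal, actions):
    """Breadth-first search; returns a shortest list of action names, or None."""
    start = frozenset(init)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, steps = queue.popleft()
        if goal <= state:  # every goal predicate holds in this state
            return steps
        for name, (pre, add, delete) in actions.items():
            if pre <= state:  # action is applicable
                nxt = frozenset((state - delete) | add)
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, steps + [name]))
    return None


init = {"corn_in_pot", "hand_empty", "drawer_closed", "towel_in_drawer"}
goal = {"corn_in_bowl", "holding_towel"}
steps = plan(init, goal, ACTIONS)
print(steps)
```

Breadth-first search returns a plan with the fewest steps, which makes it a convenient reference for small examples, but it does not scale to the size of domain described on this page.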

Task Instruction

Move the corn from the pot into the orange bowl, wipe the table with the towel in the drawer and put it back to the closed drawer.

PDDL Problem Generated by UniDomain

Executable Plan

From Demonstrations to Planning

Pretrain symbolic domains from demonstrations. UniDomain segments each video into keyframes, proposes an initial domain, and refines it through closed-loop verification.
Fuse relevant domain fragments into a meta-domain. UniDomain retrieves the atomic domains relevant to a task family and merges them into a compact planning graph.
Ground the scene and plan online. UniDomain grounds a scene image and user instruction into a PDDL problem, which is then solved into a plan.
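
The closed-loop verification in the first step can be pictured as a propose, check, revise loop. The sketch below is an assumption about the general shape of such a loop, not the authors' implementation: the domain is a plain dictionary, the only check is a toy test for undeclared predicates, and `declare_missing` is a hypothetical stand-in for a model-driven revision step.

```python
def used_predicates(domain):
    """Collect every predicate name mentioned by any operator."""
    used = set()
    for pre, eff in domain["operators"].values():
        used |= set(pre) | set(eff)
    return used


def undeclared_predicates(domain):
    """Toy check: return an error message, or None if the domain passes."""
    missing = sorted(used_predicates(domain) - set(domain["predicates"]))
    return f"undeclared predicates: {missing}" if missing else None


def refine_until_valid(draft, checks, revise, max_rounds=5):
    """Run all checks; pass failures to `revise`; stop when clean or out of rounds."""
    for round_idx in range(max_rounds):
        failures = [msg for msg in (check(draft) for check in checks) if msg]
        if not failures:
            return draft, round_idx
        draft = revise(draft, failures)
    return None, max_rounds


def declare_missing(domain, failures):
    """Hypothetical revision step: declare whatever the operators use."""
    merged = sorted(set(domain["predicates"]) | used_predicates(domain))
    return {**domain, "predicates": merged}


# Toy draft domain with one undeclared predicate ("hand_empty").
draft = {
    "predicates": ["holding"],
    "operators": {"pick": (["hand_empty"], ["holding"])},
}
fixed, rounds = refine_until_valid(draft, [undeclared_predicates], declare_missing)
print(fixed["predicates"], rounds)  # ['hand_empty', 'holding'] 1
```

Keeping the checks as a list of independent functions means further checks can be added without changing the loop itself.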

Overview of UniDomain — UniDomain first learns reusable domains from demonstrations, then fuses the right planning graph for a task family, and finally grounds the scene to produce a plan.

State-of-the-Art Performance

Across 100 unseen long-horizon tasks in four domains, UniDomain outperforms both direct LLM/VLM planners and hybrid LLM-PDDL baselines on success, plan quality, and efficiency.

Comparison results of UniDomain and baselines: on unseen tasks, UniDomain leads on core planning metrics and, among top-performing methods, keeps runtime competitive while using fewer LLM calls.

Success Rate breakdown

Why It Works

Verification keeps the learned domains usable. Without closed-loop verification, atomic domains become brittle in syntax, solvability, and task logic.
Hierarchical fusion builds a coherent planning graph. Naive union or direct LLM-only merging produces domains that do not compose cleanly.
Task-relevant grounding makes the planner much stronger. Predicate grouping and task-relevant filtering significantly improve planning performance.
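
One generic way to picture task-relevant filtering is backward relevance analysis: start from the goal predicates and keep only operators whose effects can contribute to them, directly or through the preconditions of operators already kept. This page does not describe how UniDomain's filter works, so the sketch below is a textbook-style stand-in with hypothetical operator and predicate names.

```python
def relevant_operators(operators, goal):
    """operators: name -> (preconditions, effects). Returns names worth keeping.

    An operator is kept if one of its effects is a goal predicate or a
    precondition of an operator that was already kept.
    """
    needed = set(goal)
    kept = set()
    changed = True
    while changed:
        changed = False
        for name, (pre, eff) in operators.items():
            if name not in kept and needed & set(eff):
                kept.add(name)
                needed |= set(pre)
                changed = True
    return kept


# Toy operator set, invented for illustration only.
OPERATORS = {
    "pick_corn": (["corn_in_pot", "gripper_free"], ["holding_corn"]),
    "place_corn": (["holding_corn"], ["corn_in_bowl"]),
    "pick_spoon": (["gripper_free"], ["holding_spoon"]),
    "stir": (["holding_spoon"], ["soup_stirred"]),
}

kept = relevant_operators(OPERATORS, {"corn_in_bowl"})
print(sorted(kept))  # ['pick_corn', 'place_corn']
```

Dropping `pick_spoon` and `stir` shrinks the search space for this goal without removing any operator that could appear in a valid plan for it.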

Ablation study on domain generation — Closed-loop verification and hierarchical fusion are essential for building usable atomic domains and compact meta-domains.

Ablation study of the UniDomain planner — Predicate grouping and task-relevant filtering significantly improve planning performance on compositional tasks.

From Language to a Drink on the Table

UniDomain can be seamlessly integrated into a real robot system. In our drink-making setup, a dual-arm humanoid robot takes spoken requests, reasons over ingredients and preparation steps, and serves the finished drink to the user.

The user can customize what to make from available ingredients such as tea, milk, water, floral teas, and fruit syrups including mango, lychee, and kumquat lemon, rather than choosing from a single fixed recipe.

Example request: “Make me a cup of milk tea with mango juice.”

BibTeX

@inproceedings{NEURIPS2025_b8358a00,
 author = {Ye, Haoming and Xiao, Yunxiao and Lu, Cewu and Cai, Panpan},
 booktitle = {Advances in Neural Information Processing Systems},
 editor = {D. Belgrave and C. Zhang and H. Lin and R. Pascanu and P. Koniusz and M. Ghassemi and N. Chen},
 pages = {126668--126713},
 publisher = {Curran Associates, Inc.},
 title = {UniDomain: Pretraining a Unified PDDL Domain from Real-World Demonstrations for Generalizable Robot Task Planning},
 url = {https://proceedings.neurips.cc/paper_files/paper/2025/file/b8358a00e5b870194b974ddf8dd415c3-Paper-Conference.pdf},
 volume = {38},
 year = {2025}
}