Imitation learning has proven effective at mimicking demonstrations across various robotic manipulation tasks. However, to develop robust policies, current imitation methods such as diffusion policy require training on extensive demonstrations, making data collection labor-intensive. In contrast, model-based planning with dynamics models can cover a sufficient range of configurations using only off-policy data. Yet, without the guidance of expert demonstrations, many tasks are difficult and time-consuming to plan with dynamics models alone. We therefore take the best of both dynamics models and imitation learning, and propose neural dynamics augmented imitation learning, which covers a large range of scene configurations with few-shot demonstrations. The method trains a robust diffusion policy in a local supporting region using few-shot demonstrations, and manipulates the randomly initialized object into this region with neural dynamics models trained offline. Extensive experiments across various tasks in both simulation and real-world scenarios, including granular manipulation, contact-rich tasks, and multi-object interaction tasks, demonstrate that, trained with only 1 to 30 demonstrations, our proposed method robustly covers a significantly larger area than a policy trained purely on the demonstrations.
(Left) We propose the neural dynamics augmented diffusion policy, in which a few-shot diffusion policy robustly covers a local supporting region and a dynamics model extends the initial configuration space. The green region denotes the supporting region covered by the few-shot diffusion policy, and the red region denotes the space outside the supporting region that model-based planning with the dynamics model can cover. (Right) The proposed method demonstrates its performance on various tasks. The dark green region denotes the area the few-shot diffusion policy can cover, and the light green region denotes the additional space covered by our method's augmentation.
Our Proposed Framework. (a) Collecting few-shot human demonstrations that cover a convex hull in the configuration space. (b) The diffusion policy trained on the few-shot human demonstrations is robust within the local supporting region, but lacks robustness for configurations outside it. (c) Model-based planning equipped with dynamics models generates manipulation trajectories from various initial poses to the supporting region. (d) The full policy, leveraging the trajectories generated in (c), is robust over the large space.
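To make the interplay between (b) and (c) concrete, the sketch below illustrates one possible high-level control loop under stated assumptions: when the object lies outside the supporting region, a sampling-based planner rolls out a learned dynamics model to pick actions that move the object toward the region; once inside, the few-shot diffusion policy takes over. All class and function names here (e.g., plan_to_region, region.contains, dynamics.predict) are hypothetical and are not the authors' actual interfaces.

```python
# Minimal sketch of the two-stage control loop in (b)-(d); all interfaces are assumed.
import numpy as np


def plan_to_region(dynamics, state, region, action_dim, horizon=10, n_samples=256):
    """Random-shooting MPC with a learned dynamics model: sample action sequences,
    roll them out through the model, and keep the first action of the sequence whose
    predicted final state lands closest to the supporting region."""
    best_action, best_dist = None, np.inf
    for _ in range(n_samples):
        actions = np.random.uniform(-1.0, 1.0, size=(horizon, action_dim))
        s = state
        for a in actions:
            s = dynamics.predict(s, a)        # learned one-step prediction
        dist = region.distance(s)             # distance of predicted end state to region
        if dist < best_dist:
            best_action, best_dist = actions[0], dist
    return best_action                        # execute the first action, then replan


def run_episode(env, dynamics, diffusion_policy, region, action_dim, max_steps=200):
    """Outside the supporting region: model-based planning. Inside: few-shot diffusion policy."""
    obs = env.reset()
    for _ in range(max_steps):
        if region.contains(obs["object_state"]):
            action = diffusion_policy.act(obs)
        else:
            action = plan_to_region(dynamics, obs["object_state"], region, action_dim)
        obs, done = env.step(action)
        if done:
            break
    return obs
```

In this sketch, planning is responsible only for reaching the supporting region; fine-grained manipulation is always delegated to the diffusion policy, mirroring the division of labor described in the caption.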
InsertT: inserting a T-shape (initially placed at a random position with a random orientation on the table) into a slot. The manipulation succeeds when the T-shape is inserted into the slot.
Stow: stowing a book (initially placed at a random position with a random orientation on the table) onto a bookshelf that already holds a few books. The manipulation succeeds when all books are placed in an upright posture.
DustPan: sweeping sparsely scattered granular pieces into the dustpan. The evaluation metric in simulation is the ratio of granular pieces successfully swept into the dustpan; in the real world, a trial succeeds when at least 90% of the pieces are swept into the dustpan (a minimal sketch of this metric follows the task descriptions).
HangMug: hanging a mug (randomly positioned on the table) on the rack. The manipulation succeeds when the mug is hung on the rack.
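The DustPan metric above amounts to a simple ratio with a threshold for real-world success. The sketch below is a minimal illustration under assumed inputs (piece positions and a membership test for the dustpan region); the names are hypothetical and not the authors' evaluation code.

```python
# Minimal sketch of the DustPan metric; inputs and names are assumed.
import numpy as np


def dustpan_ratio(piece_positions, in_dustpan):
    """Simulation metric: fraction of granular pieces lying inside the dustpan region."""
    inside = np.array([in_dustpan(p) for p in piece_positions], dtype=float)
    return float(inside.mean())


def dustpan_success(piece_positions, in_dustpan, threshold=0.9):
    """Real-world success: at least 90% of the pieces are swept into the dustpan."""
    return dustpan_ratio(piece_positions, in_dustpan) >= threshold
```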
Qualitative Analysis on InsertT. With the same few-shot human demonstrations, the original diffusion policy is robust only in a limited local region, whereas our proposed method, augmented with the neural dynamics model, remains robust over a much wider space. The demonstrations under Diffusion Policy show the few-shot (10) human demonstrations, and those under \textbf{Ours} show how the dynamics-augmented demonstrations cover the larger space.
Qualitative Analysis on DustPan, Stow, and HangMug. While the diffusion policy only covers specific regions, our method covers a significantly larger space by using model-based planning to manipulate diverse objects across different tasks into the local supporting region, followed by the few-shot diffusion policy. For DustPan, "Planning" denotes that this step is performed by model-based planning. For HangMug, red denotes success and blue denotes failure.