Policy learning focuses on devising strategies for agents in embodied artificial intelligence systems to perform optimal actions based on their perceived states. One of the key challenges in policy learning involves handling complex, long-horizon tasks that require managing extensive sequences of actions and observations with multiple modes. Wavelet analysis offers significant advantages in signal processing, notably in decomposing signals at multiple scales to capture both global trends and fine-grained details. In this work, we introduce a novel wavelet policy learning framework that utilizes wavelet transformations to enhance policy learning. Our approach leverages learnable multi-scale wavelet decomposition to facilitate detailed observation analysis and robust action planning over extended sequences. We detail the design and implementation of our wavelet policy, which incorporates lifting schemes for effective multi-resolution analysis and action generation. This framework is evaluated across multiple complex scenarios, including robotic manipulation, self-driving, and multi-robot collaboration, demonstrating the effectiveness of our method in improving the precision and reliability of the learned policy.
An illustration of the wavelet policy network with a multi-scale lifting scheme, where an observation sequence (left) is processed recursively through splitting, prediction (P), and update (U) steps, with converters and fusers recombining components to generate a corresponding action sequence (right). Blue lines and arrows indicate low-frequency (coarse-scale) components, while red ones indicate high-frequency (fine-scale) components. Only two scales are shown for illustration.
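The split, predict, and update steps named in the caption can be sketched with a classical, fixed Haar-style lifting scheme. This is only a minimal illustration under stated assumptions: the framework described here uses learnable prediction and update operators together with converters and fusers, none of which are reproduced below. The function names (`lifting_forward`, `lifting_inverse`, `multiscale_decompose`, `multiscale_reconstruct`) and the fixed predict/update rules are hypothetical choices made for this example.

```python
def lifting_forward(signal):
    """One lifting level on an even-length sequence: split, predict, update."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    # Split: separate even-indexed and odd-indexed samples.
    even = signal[0::2]
    odd = signal[1::2]
    # Predict (P): estimate each odd sample from its even neighbour;
    # the residual is the high-frequency (fine-scale) detail.
    # Assumption: a fixed identity predictor stands in for a learned one.
    detail = [o - e for o, e in zip(odd, even)]
    # Update (U): correct the even samples with the detail so the
    # low-frequency (coarse-scale) part keeps the pairwise mean.
    coarse = [e + d / 2 for e, d in zip(even, detail)]
    return coarse, detail


def lifting_inverse(coarse, detail):
    """Undo one lifting level by reversing the update and predict steps."""
    even = [c - d / 2 for c, d in zip(coarse, detail)]
    odd = [d + e for d, e in zip(detail, even)]
    signal = []
    for e, o in zip(even, odd):
        signal.extend([e, o])
    return signal


def multiscale_decompose(signal, levels):
    """Apply lifting recursively to the coarse part, one pass per scale."""
    coarse = list(signal)
    details = []
    for _ in range(levels):
        coarse, detail = lifting_forward(coarse)
        details.append(detail)
    return coarse, details


def multiscale_reconstruct(coarse, details):
    """Rebuild the sequence from the coarsest scale back to the finest."""
    for detail in reversed(details):
        coarse = lifting_inverse(coarse, detail)
    return coarse


# Two scales, matching the number of scales drawn in the illustration.
observations = [2.0, 4.0, 6.0, 8.0, 5.0, 3.0, 1.0, 7.0]
coarse, details = multiscale_decompose(observations, levels=2)
restored = multiscale_reconstruct(coarse, details)
```

Because each lifting step is inverted by replaying the same operations with opposite signs, reconstruction is exact for any choice of predictor and updater. That property is what makes it plausible to swap the fixed rules above for learned ones without losing invertibility.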
Completed tasks in Kitchen: top burner, kettle, hinge cabinet, light switch, slide cabinet.
Right and left turns in CARLA.
Push-T: pushing a T-shaped block (gray) to a fixed target (green) with a circular end-effector (blue).
Transport: trained on proficient human (PH) teleoperated demonstrations.
Transport: trained on mixed proficient/non-proficient human (MH) demonstrations.
Aligning: push a hollow box to a predefined target position and orientation.
Avoiding: reaching the green finish line without colliding with any of the six obstacles.
Pushing: pushing two blocks to fixed target zones.
Sorting: sorting red and blue blocks to their color-matching target box.
@article{huang2025wavelet,
  title={Wavelet Policy: Lifting Scheme for Policy Learning in Long-Horizon Tasks},
  author={Huang, Hao and Yuan, Shuaihang and Bethala, Geeta Chandra Raju and Wen, Congcong and Tzes, Anthony and Fang, Yi},
  journal={arXiv preprint arXiv:2507.04331},
  year={2025}
}