Less is More: an Attention-free Sequence Prediction Modeling for Offline Embodied Learning

NeurIPS 2025

Wei Huang1,2, Jianshu Zhang3, Leiyu Wang3, Heyue Li4, Luoyi Fan3, Yichen Zhu5,
Nanyang Ye3†, Qinying Gu1†
1Shanghai AI Laboratory, 2Tsinghua University,
3Shanghai Jiao Tong University, 4Wuhan University, 5Midea Group,

† indicates corresponding author

Entropy-reward Correlation

Entropy-reward correlation (left) and attention maps (right) for DT (top) and DT with Token Merger (bottom), where lower entropy signifies more focused attention and thus stronger local dependency modeling. DT exhibits state-dominated attention, with sparse weights on RTG and action tokens, leading to higher attention entropy and a strong correlation between local entropy maxima and reward minima. In contrast, DT with Token Merger achieves significantly lower entropy, more balanced inter-step dependencies, and notably higher rewards.
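
One way to compute such an attention entropy is sketched below; this is a minimal sketch assuming row-normalized attention weights of shape (heads, queries, keys), and the exact averaging scheme used in the paper may differ.

import torch

def attention_entropy(attn_weights: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Shannon entropy of each query's attention distribution.

    attn_weights: (num_heads, num_queries, num_keys), rows summing to 1.
    Returns per-query entropy averaged over heads; lower values indicate
    more focused (less dispersed) attention.
    """
    entropy = -(attn_weights * (attn_weights + eps).log()).sum(dim=-1)  # (heads, queries)
    return entropy.mean(dim=0)                                          # (queries,)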


Architecture for Offline RL


Hierarchical modeling framework and the micro-structure of the Token Merger. The Token Merger fuses each (R, s, a) triplet into a unified per-timestep representation, while the Token Mixer models dependencies across timesteps.
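
As an illustration of the intra-step fusion idea, the sketch below implements a simple convolutional Token Merger in PyTorch; the kernel size, stride, and module names are illustrative assumptions rather than the paper's exact configuration.

import torch
import torch.nn as nn

class ConvTokenMerger(nn.Module):
    """Fuse the three (R, s, a) tokens of each timestep into one token.

    Input:  (batch, timesteps, 3, d)  -- three embedded tokens per step
    Output: (batch, timesteps, d)     -- one merged token per step
    """
    def __init__(self, d_model: int):
        super().__init__()
        # Kernel size 3 with stride 3 slides over the flattened
        # (R, s, a, R, s, a, ...) token stream, collapsing each triplet.
        self.conv = nn.Conv1d(d_model, d_model, kernel_size=3, stride=3)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        b, t, k, d = tokens.shape                         # k == 3
        x = tokens.reshape(b, t * k, d).transpose(1, 2)   # (b, d, 3t)
        return self.conv(x).transpose(1, 2)               # (b, t, d)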


Real World


We further evaluate the model’s inference speed and generalization ability in real-world settings using a 7-DOF Franka Emika Panda robotic arm equipped with a single-joint torque sensor.


Visualization

t-SNE visualization of embeddings with/without Token Merger. Post-merger embeddings exhibit structured clustering, indicating enhanced local associations.
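
The t-SNE projection itself is standard; a hedged sketch follows, assuming the embeddings have been collected into a (num_tokens, d) NumPy array and colored by an arbitrary label such as token type or timestep. The perplexity and styling are illustrative choices, not the paper's settings.

import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

def plot_tsne(embeddings: np.ndarray, labels: np.ndarray, title: str) -> None:
    """Project embeddings to 2D with t-SNE and color points by label."""
    coords = TSNE(n_components=2, perplexity=30, init="pca",
                  random_state=0).fit_transform(embeddings)
    plt.scatter(coords[:, 0], coords[:, 1], c=labels, s=4, cmap="tab10")
    plt.title(title)
    plt.show()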

Abstract

Offline reinforcement learning (offline RL) is increasingly approached as a sequence modeling task, with methods leveraging advanced architectures such as Transformers to capture trajectory dependencies. Despite significant progress, the mechanisms underlying their effectiveness and limitations remain insufficiently understood. Through an entropy-based analysis of the representative Decision Transformer (DT), we identify inconsistencies in the state-action-reward (s, a, R) distributions that cause attention "dispersal". To address this, we propose a hierarchical framework that decomposes sequence modeling into intra-step relational modeling, handled by a Token Merger that fuses each (s, a, R) triplet, and inter-step modeling, handled by a Token Mixer across timesteps. We investigate several Token Merger designs and validate their effectiveness across various offline RL methods. Furthermore, our theoretical analysis and experimental results suggest that while the Token Mixer is important, a lightweight architecture can match or even outperform more complex ones. We therefore propose a parameter-free Average Pooling Token Mixer, which, combined with a convolutional Token Merger, forms our final model, Decision HiFormer (DHi). Compared to DT on the D4RL benchmark, DHi achieves a 73.6% improvement in inference speed and a 9.3% gain in policy performance. DHi also generalizes well to real-world robotic manipulation tasks, offering both practical benefits and insights into sequence-based policy design for offline RL.
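
To make the parameter-free Average Pooling Token Mixer concrete, the sketch below shows one plausible causal-average instantiation over merged per-timestep tokens; whether the model uses a cumulative or windowed average, and how actions are decoded, are assumptions here rather than the official implementation.

import torch
import torch.nn as nn

class AvgPoolTokenMixer(nn.Module):
    """Parameter-free inter-step mixing: each timestep is represented by the
    causal (running) mean of all merged tokens up to and including itself."""
    def forward(self, merged: torch.Tensor) -> torch.Tensor:
        # merged: (batch, timesteps, d) -- output of a Token Merger
        cumsum = merged.cumsum(dim=1)
        counts = torch.arange(1, merged.size(1) + 1,
                              device=merged.device).view(1, -1, 1)
        return cumsum / counts   # causal average; no learnable parameters

In such a pipeline, a linear head on the mixed tokens would predict the next action, replacing the self-attention stack of DT while keeping the same trajectory interface.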

BibTeX


@InProceedings{huang2025decisionhiformer,
  author    = {Wei Huang and Jianshu Zhang and Leiyu Wang and Heyue Li and Luoyi Fan and Yichen Zhu and Nanyang Ye and Qinying Gu},
  title     = {Less is More: an Attention-free Sequence Prediction Modeling for Offline Embodied Learning},
  booktitle = {Advances in Neural Information Processing Systems},
  year      = {2025},
}