Decision SpikeFormer: Spike-Driven Transformer for Decision Making

CVPR 2025

Wei Huang1,2, Qinying Gu1†, Nanyang Ye3†
1Shanghai AI Laboratory, 2Wuhan University, 3Shanghai Jiao Tong University
† indicates corresponding author

Spike-driven Self-Attention Mechanism

[Figure: spike-driven self-attention mechanism]

Architecture for offline RL

[Figure: DSFormer architecture for offline RL]

PTBN

[Figure: Progressive Threshold-dependent Batch Normalization (PTBN)]

Abstract

Offline reinforcement learning (RL) enables policy training solely on pre-collected data, avoiding direct environment interaction, which is a crucial benefit for energy-constrained embodied AI applications. Although Artificial Neural Network (ANN)-based methods perform well in offline RL, their high computational and energy demands motivate the exploration of more efficient alternatives. Spiking Neural Networks (SNNs) show promise for such tasks, given their low power consumption. In this work, we introduce DSFormer, the first spike-driven transformer model designed to tackle offline RL via sequence modeling. Unlike existing SNN transformers, which focus on spatial dimensions for vision tasks, DSFormer introduces Temporal Spiking Self-Attention (TSSA) and Positional Spiking Self-Attention (PSSA) to capture the temporal and positional dependencies essential for sequence modeling in RL. Additionally, we propose Progressive Threshold-dependent Batch Normalization (PTBN), which combines the benefits of LayerNorm and BatchNorm to preserve temporal dependencies while maintaining the spiking nature of SNNs. Comprehensive results on the D4RL benchmark show DSFormer's superiority over both SNN and ANN counterparts, achieving 78.4% energy savings and demonstrating DSFormer's advantages in both energy efficiency and competitive task performance.
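To illustrate the general idea behind spike-driven self-attention for sequence modeling, the minimal numpy sketch below binarizes the query, key, and value projections into 0/1 spikes before the attention products, replaces the softmax with nonnegative spike co-occurrence counts, and applies a causal mask so each step attends only to the past. This is an illustrative simplification under assumed details, not the paper's actual TSSA/PSSA formulation: the threshold value, scaling, and the `spiking_self_attention` helper itself are hypothetical.

```python
import numpy as np

def heaviside(x, threshold=1.0):
    # Spiking activation: emit a binary spike where the membrane
    # potential reaches the (assumed) firing threshold.
    return (x >= threshold).astype(np.float32)

def spiking_self_attention(X, Wq, Wk, Wv, threshold=1.0):
    """Illustrative spike-driven self-attention over a sequence X of shape (T, D).

    Q, K, V are binarized into spikes before the attention products, so the
    core matmuls have only 0/1 operands (addition-dominated) -- the property
    that makes spike-driven attention energy-efficient in hardware.
    """
    Q = heaviside(X @ Wq, threshold)   # (T, D), binary
    K = heaviside(X @ Wk, threshold)   # (T, D), binary
    V = heaviside(X @ Wv, threshold)   # (T, D), binary
    # No softmax: scores are nonnegative integer spike co-occurrence counts.
    scores = Q @ K.T                   # (T, T)
    # Causal mask for RL sequence modeling: step t attends only to steps <= t.
    mask = np.tril(np.ones_like(scores))
    # Scale by D to keep output magnitudes bounded (an assumed choice).
    return (scores * mask) @ V / X.shape[1]

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8)).astype(np.float32)
Wq, Wk, Wv = (rng.normal(size=(8, 8)).astype(np.float32) for _ in range(3))
out = spiking_self_attention(X, Wq, Wk, Wv)
print(out.shape)  # (4, 8)
```

Because every operand entering the attention products is binary and the mask zeroes the upper triangle, all outputs are nonnegative and each timestep's output depends only on current and earlier inputs.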

BibTeX


@InProceedings{huang2025decisionspike,
  author    = {Huang, Wei and Gu, Qinying and Ye, Nanyang},
  title     = {Decision SpikeFormer: Spike-Driven Transformer for Decision Making},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year      = {2025},
  month     = {June}
}