Abstract

This work introduces LIBERO-Mem, a non-Markovian task suite for stress-testing robotic manipulation under object-level partial observability. The benchmark tests whether policies can track object identities and histories and complete temporally sequenced subgoals.

The paper also presents Embodied-SlotSSM, a slot-centric vision-language-action (VLA) model that maintains temporally consistent object representations and uses them for memory-aware action prediction in long-horizon manipulation.