[Paper Quick Look] VLMs are Zero-Shot Reward Models for RL [2310.12921]
Posted by