Vision-Language-Action Safety: Threats, Challenges, Evaluations, and Mechanisms Paper • 2604.23775 • Published 7 days ago • 44
Gated Condition Injection without Multimodal Attention: Towards Controllable Linear-Attention Transformers Paper • 2603.27666 • Published Mar 29 • 18
Anatomy of a Lie: A Multi-Stage Diagnostic Framework for Tracing Hallucinations in Vision-Language Models Paper • 2603.15557 • Published Mar 16 • 29
ViFeEdit: A Video-Free Tuner of Your Video Diffusion Transformer Paper • 2603.15478 • Published Mar 16 • 24