WildWorld: A Large-Scale Dataset for Dynamic World Modeling with Actions and Explicit State toward Generative ARPG • Paper • 2603.23497 • Published 10 days ago • 90
Perceptio: Perception Enhanced Vision Language Models via Spatial Token Generation • Paper • 2603.18795 • Published 15 days ago • 16
VIDEOP2R: Video Understanding from Perception to Reasoning • Paper • 2511.11113 • Published Nov 14, 2025 • 112
LRM: Large Reconstruction Model for Single Image to 3D • Paper • 2311.04400 • Published Nov 8, 2023 • 52
MVDream: Multi-view Diffusion for 3D Generation • Paper • 2308.16512 • Published Aug 31, 2023 • 106