EmbeddingGemma: Powerful and Lightweight Text Representations Paper • 2509.20354 • Published Sep 24 • 41
Lumine: An Open Recipe for Building Generalist Agents in 3D Open Worlds Paper • 2511.08892 • Published 25 days ago • 193
One Small Step in Latent, One Giant Leap for Pixels: Fast Latent Upscale Adapter for Your Diffusion Models Paper • 2511.10629 • Published 23 days ago • 122
PAN: A World Model for General, Interactable, and Long-Horizon World Simulation Paper • 2511.09057 • Published 25 days ago • 75
SLA: Beyond Sparsity in Diffusion Transformers via Fine-Tunable Sparse-Linear Attention Paper • 2509.24006 • Published Sep 28 • 118
WMPO: World Model-based Policy Optimization for Vision-Language-Action Models Paper • 2511.09515 • Published 24 days ago • 17
LeJEPA: Provable and Scalable Self-Supervised Learning Without the Heuristics Paper • 2511.08544 • Published 25 days ago • 6
Self-Forcing++: Towards Minute-Scale High-Quality Video Generation Paper • 2510.02283 • Published Oct 2 • 95
Back to Basics: Let Denoising Generative Models Denoise Paper • 2511.13720 • Published 19 days ago • 63
Kandinsky 5.0: A Family of Foundation Models for Image and Video Generation Paper • 2511.14993 • Published 18 days ago • 222
In-Video Instructions: Visual Signals as Generative Control Paper • 2511.19401 • Published 12 days ago • 29
Diversity Has Always Been There in Your Visual Autoregressive Models Paper • 2511.17074 • Published 16 days ago • 7
MedSAM3: Delving into Segment Anything with Medical Concepts Paper • 2511.19046 • Published 13 days ago • 48
Inferix: A Block-Diffusion based Next-Generation Inference Engine for World Simulation Paper • 2511.20714 • Published 12 days ago • 45