Lumine: An Open Recipe for Building Generalist Agents in 3D Open Worlds Paper • 2511.08892 • Published 25 days ago • 194
InstructX: Towards Unified Visual Editing with MLLM Guidance Paper • 2510.08485 • Published Oct 9 • 16
OmniInsert: Mask-Free Video Insertion of Any Reference via Diffusion Transformer Models Paper • 2509.17627 • Published Sep 22 • 66
MUSAR: Exploring Multi-Subject Customization from Single-Subject Dataset via Attention Routing Paper • 2505.02823 • Published May 5 • 5
Have we unified image generation and understanding yet? An empirical study of GPT-4o's image generation ability Paper • 2504.08003 • Published Apr 9 • 49
When Less is Enough: Adaptive Token Reduction for Efficient Image Representation Paper • 2503.16660 • Published Mar 20 • 72