UltraFlux: Data-Model Co-Design for High-quality Native 4K Text-to-Image Generation across Diverse Aspect Ratios Paper • 2511.18050 • Published 16 days ago • 37
Article We’re open-sourcing our text-to-image model and the process behind it 26 days ago • 73
Uniworld-V2: Reinforce Image Editing with Diffusion Negative-aware Finetuning and MLLM Implicit Feedback Paper • 2510.16888 • Published Oct 19 • 21
SwarmSys: Decentralized Swarm-Inspired Agents for Scalable and Adaptive Reasoning Paper • 2510.10047 • Published Oct 11 • 13
LucidFlux: Caption-Free Universal Image Restoration via a Large-Scale Diffusion Transformer Paper • 2509.22414 • Published Sep 26 • 20
Temporal Regularization Makes Your Video Generator Stronger Paper • 2503.15417 • Published Mar 19 • 22
STEVE: A Step Verification Pipeline for Computer-use Agent Training Paper • 2503.12532 • Published Mar 16 • 17
CoRe^2: Collect, Reflect and Refine to Generate Better and Faster Paper • 2503.09662 • Published Mar 12 • 33
Seedream 2.0: A Native Chinese-English Bilingual Image Generation Foundation Model Paper • 2503.07703 • Published Mar 10 • 37
YuE: Scaling Open Foundation Models for Long-Form Music Generation Paper • 2503.08638 • Published Mar 11 • 71
MagicInfinite: Generating Infinite Talking Videos with Your Words and Voice Paper • 2503.05978 • Published Mar 7 • 36
VideoGrain: Modulating Space-Time Attention for Multi-grained Video Editing Paper • 2502.17258 • Published Feb 24 • 79
MagicTailor: Component-Controllable Personalization in Text-to-Image Diffusion Models Paper • 2410.13370 • Published Oct 17, 2024 • 37
Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image Synthesis Paper • 2410.08261 • Published Oct 10, 2024 • 52