arxiv:2512.05150

TwinFlow: Realizing One-step Generation on Large Models with Self-adversarial Flows

Published on Dec 3

· Submitted by

Zhenglin Cheng (SII) on Dec 8

#1 Paper of the day

inclusionAI

Upvote

Authors:

Abstract

TwinFlow is a 1-step generative model framework that enhances inference efficiency without requiring fixed pretrained teacher models or standard adversarial networks, achieving high performance on text-to-image tasks and scaling efficiently.

AI-generated summary

Recent advances in large multi-modal generative models have demonstrated impressive capabilities in multi-modal generation, including image and video generation. These models are typically built upon multi-step frameworks like diffusion and flow matching, which inherently limits their inference efficiency (requiring 40-100 Number of Function Evaluations (NFEs)). While various few-step methods aim to accelerate the inference, existing solutions have clear limitations. Prominent distillation-based methods, such as progressive and consistency distillation, either require an iterative distillation procedure or show significant degradation at very few steps (< 4-NFE). Meanwhile, integrating adversarial training into distillation (e.g., DMD/DMD2 and SANA-Sprint) to enhance performance introduces training instability, added complexity, and high GPU memory overhead due to the auxiliary trained models. To this end, we propose TwinFlow, a simple yet effective framework for training 1-step generative models that bypasses the need of fixed pretrained teacher models and avoids standard adversarial networks during training, making it ideal for building large-scale, efficient models. On text-to-image tasks, our method achieves a GenEval score of 0.83 in 1-NFE, outperforming strong baselines like SANA-Sprint (a GAN loss-based framework) and RCGM (a consistency-based framework). Notably, we demonstrate the scalability of TwinFlow by full-parameter training on Qwen-Image-20B and transform it into an efficient few-step generator. With just 1-NFE, our approach matches the performance of the original 100-NFE model on both the GenEval and DPG-Bench benchmarks, reducing computational cost by 100times with minor quality degradation. Project page is available at https://zhenglin-cheng.com/twinflow.

View arXiv page View PDF Project page GitHub 34 Add to collection

Community

kenshinn

Paper submitter 1 day ago

•

edited 1 day ago

Taming 20B full-parameter few-step training with self-adversarial flows! 👏🏻

One-model Simplicity: We eliminate the need for auxiliary networks (discriminators, teachers, fake score estimators...), everything in one model!
Scalability on Large Models: We transform Qwen-Image-20B into high-quality few-step generators by full-parameter training (Optimized for human figure generation!).

Checkout our 2-NFE images generated by our TwinFlow-Qwen-Image! 👇