arxiv:2512.05150

TwinFlow: Realizing One-step Generation on Large Models with Self-adversarial Flows

Published on Dec 3
Submitted by Zhenglin Cheng (SII) on Dec 8
#1 Paper of the day

Abstract

TwinFlow is a 1-step generative model framework that enhances inference efficiency without requiring fixed pretrained teacher models or standard adversarial networks, achieving high performance on text-to-image tasks and scaling efficiently.

AI-generated summary

Recent advances in large multi-modal generative models have demonstrated impressive capabilities in multi-modal generation, including image and video generation. These models are typically built upon multi-step frameworks like diffusion and flow matching, which inherently limits their inference efficiency (requiring 40-100 Number of Function Evaluations (NFEs)). While various few-step methods aim to accelerate inference, existing solutions have clear limitations. Prominent distillation-based methods, such as progressive and consistency distillation, either require an iterative distillation procedure or show significant degradation at very few steps (< 4-NFE). Meanwhile, integrating adversarial training into distillation (e.g., DMD/DMD2 and SANA-Sprint) to enhance performance introduces training instability, added complexity, and high GPU memory overhead due to the auxiliary trained models. To this end, we propose TwinFlow, a simple yet effective framework for training 1-step generative models that bypasses the need for fixed pretrained teacher models and avoids standard adversarial networks during training, making it ideal for building large-scale, efficient models. On text-to-image tasks, our method achieves a GenEval score of 0.83 in 1-NFE, outperforming strong baselines like SANA-Sprint (a GAN loss-based framework) and RCGM (a consistency-based framework). Notably, we demonstrate the scalability of TwinFlow by full-parameter training on Qwen-Image-20B, transforming it into an efficient few-step generator. With just 1-NFE, our approach matches the performance of the original 100-NFE model on both the GenEval and DPG-Bench benchmarks, reducing computational cost by 100× with minor quality degradation. Project page is available at https://zhenglin-cheng.com/twinflow.
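
To make the NFE accounting in the abstract concrete, here is a minimal sketch of a fixed-step Euler sampler that counts one function evaluation per network call. It is not TwinFlow's training or sampling code: `euler_sample` and `toy_velocity` are hypothetical names, and the toy velocity field is an assumed stand-in for a trained network whose trajectories happen to be perfectly straight, which is the idealized case where a single step already lands on the target.

```python
from typing import Callable, List, Tuple

Velocity = Callable[[List[float], float], List[float]]


def euler_sample(velocity: Velocity, x0: List[float], num_steps: int) -> Tuple[List[float], int]:
    """Integrate dx/dt = velocity(x, t) from t=0 to t=1 with fixed-step Euler.

    Returns the final sample and the number of function evaluations (NFE),
    i.e. how many times the velocity "network" was called.
    """
    x = list(x0)
    nfe = 0
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = i * dt
        v = velocity(x, t)  # one call to the model = one NFE
        nfe += 1
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x, nfe


def toy_velocity(x: List[float], t: float) -> List[float]:
    """Assumed toy field: a straight-line flow toward a fixed target point."""
    target = [1.0, -2.0]
    return [(ti - xi) / (1.0 - t) for xi, ti in zip(x, target)]


if __name__ == "__main__":
    noise = [0.0, 0.0]
    for steps in (1, 100):
        sample, nfe = euler_sample(toy_velocity, noise, steps)
        print(f"steps={steps:3d}  NFE={nfe:3d}  sample={sample}")
```

In this idealized setting both runs end at the same point, while the 100-step run calls the model 100 times as often. Real pretrained flow models have curved trajectories, which is why reaching comparable quality at 1-NFE requires dedicated training such as the method the paper proposes.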

Community

Paper submitter

Taming 20B full-parameter few-step training with self-adversarial flows!

  • One-model Simplicity: We eliminate the need for auxiliary networks (discriminators, teachers, fake score estimators...), everything in one model!
  • Scalability on Large Models: We transform Qwen-Image-20B into high-quality few-step generators by full-parameter training (Optimized for human figure generation!).

Check out these 2-NFE images generated by our TwinFlow-Qwen-Image!

[Image: hybrid_grid_layout_v4]

We are also working on Z-Image-Turbo, stay tuned!

Very nice paper!

Hope there will be one for **OnomaAIResearch/Illustrious-xl-early-release-v0**, it's gonna save us from 24/29 sampling steps for every generation.

The GitHub / Hugging Face links for the project are not found?

Sorry for the delay, we are releasing them soon.
