DiTFuse: Towards Unified Semantic and Controllable Image Fusion: A Diffusion Transformer Approach (Official Weights)
This repository provides the official pretrained weights for DiTFuse. The project code is available on GitHub:
GitHub: https://github.com/Henry-Lee-real/DiTFuse
DiTFuse supports multiple fusion tasks, including infrared-visible fusion, multi-focus fusion, multi-exposure fusion, and instruction-driven controllable fusion/segmentation, all within a single unified model.
Available Model Versions
V1: Stronger Zero-Shot Generalization
- Designed for stronger zero-shot fusion capability.
- Performs robustly on unseen fusion scenarios.
- Recommended if your use case emphasizes cross-dataset generalization.
V2: Full Capability Version (Paper Model)
This is the main model used in the DiTFuse paper and provides the most comprehensive capabilities:
- Full instruction-following control
- Joint fusion + segmentation
- Better fidelity and controllability
- Stronger alignment with text prompts
Recommended for research reproduction, benchmarking, and controllable image fusion tasks.
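Quick download example

The weights can be fetched programmatically with `huggingface_hub` before running the code from the GitHub repository above. This is only a minimal sketch: the repo id, folder layout, and file pattern below are placeholders, not the actual names used by this repository, so adjust them to match the files listed on this page.

```python
# Minimal sketch: download one version of the DiTFuse weights from the Hub.
# Placeholders: "<this-model-repo-id>" and the "v2/*" pattern are assumptions
# about how the repository is organized, not confirmed names.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="<this-model-repo-id>",   # replace with the actual repo id shown on this page
    allow_patterns=["v2/*"],          # hypothetical pattern selecting only the V2 weights
)
print("Weights downloaded to:", local_dir)
```

After downloading, point the inference or evaluation scripts from the GitHub repository at `local_dir` (see that repository's README for the exact flags).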