---
pipeline_tag: image-to-image
library_name: pytorch
license: cc-by-nc-sa-4.0
tags:
- image-restoration
- vision-transformer
- convnext-v2
datasets:
- ModelMoe/DDRD20K
---

# DDRM: Diffusion Degradation Restoration Model

## Model Description

**DDRM** (Diffusion Degradation Restoration Model) is a deep learning model for high-precision image restoration and color alignment. Using **Reference-based Color Querying**, it restores complex degraded images and matches their tone to color cues taken from an external reference image.

---

## 🛠️ Model Features

### 1. Reference-based Color Query

* **Cross-Image Semantic Alignment:** Extracts high-quality color priors from reference images.
* **Learnable Query Mechanism:** Uses 100 learnable query vectors to "search" the reference feature space and pick the color and lighting parameters that best fit the target scene.

### 2. KDE-based Distribution Prior

* **Statistical Characteristic Capture:** A built-in Kernel Density Estimation (KDE) module captures the statistical properties of the image on the fly.
* **Global Color Anchors:** Rather than relying on pixel values alone, the model analyzes features of the probability density function (PDF) so that the restored image stays statistically consistent with the reference in global tone.

### 3. Multi-Scale Transformer Decoder

* **Hybrid Architecture:** Combines the local fidelity of **ConvNeXt V2** with the global receptive field of **Transformers**.
* **Decoupled Representation Learning:** The encoder extracts structure and texture, while the multi-scale Transformer decoder injects the color information guided by the query mechanism.

---

## 🚀 Getting Started

```bash
pip install torch kornia timm opencv-python