---
license: other
license_name: flux-1-dev-non-commercial-license
license_link: LICENSE.md
base_model:
- black-forest-labs/FLUX.1-dev
base_model_relation: quantized
library_name: diffusers
tags:
- sdnq
- flux
- 4-bit
---

4-bit (UINT4 with SVD rank 32) quantization of [black-forest-labs/FLUX.1-dev](https://huggingface.co/black-forest-labs/FLUX.1-dev) using [SDNQ](https://github.com/Disty0/sdnq).

Usage:

```shell
pip install sdnq
```

```py
import torch
import diffusers

from sdnq import SDNQConfig  # importing sdnq registers it with diffusers and transformers
from sdnq.common import use_torch_compile as triton_is_available
from sdnq.loader import apply_sdnq_options_to_model

pipe = diffusers.FluxPipeline.from_pretrained(
    "Disty0/FLUX.1-dev-SDNQ-uint4-svd-r32",
    torch_dtype=torch.bfloat16,
)

# Enable INT8 MatMul on AMD, Intel Arc, and Nvidia GPUs:
if triton_is_available and (torch.cuda.is_available() or torch.xpu.is_available()):
    pipe.transformer = apply_sdnq_options_to_model(pipe.transformer, use_quantized_matmul=True)
    pipe.text_encoder_2 = apply_sdnq_options_to_model(pipe.text_encoder_2, use_quantized_matmul=True)

pipe.transformer = torch.compile(pipe.transformer)  # optional, for faster inference
pipe.enable_model_cpu_offload()

prompt = "A cat holding a sign that says hello world"
image = pipe(
    prompt,
    height=1024,
    width=1024,
    guidance_scale=3.5,
    num_inference_steps=50,
    max_sequence_length=512,
    generator=torch.manual_seed(0),
).images[0]
image.save("flux-dev-sdnq-uint4-svd-r32.png")
```

Original BF16 vs. SDNQ quantization comparison:

| Quantization | Model Size | Visualization |
| --- | --- | --- |
| Original BF16 | 23.8 GB | ![Original BF16](https://cdn-uploads.huggingface.co/production/uploads/6456af6195082f722d178522/t5VcrUq22x3Nqf0_n9ykZ.png) |
| SDNQ UINT4 | 6.8 GB | ![SDNQ UINT4](https://cdn-uploads.huggingface.co/production/uploads/6456af6195082f722d178522/31Q9ULrybsJPxlUziTetF.png) |
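
The size reduction in the table follows from the bit widths alone. A rough back-of-the-envelope sketch, assuming the FLUX.1-dev transformer has about 11.9B parameters (an approximation not stated in this card; the exact count and the overhead of the SVD rank-32 correction will shift the numbers slightly):

```python
def estimated_size_gb(num_params: float, bits_per_param: float) -> float:
    """Estimated weight storage in decimal gigabytes."""
    return num_params * bits_per_param / 8 / 1e9

# BF16 stores 16 bits per weight; UINT4 stores 4 bits per weight.
bf16_gb = estimated_size_gb(11.9e9, 16)
uint4_gb = estimated_size_gb(11.9e9, 4)

print(f"BF16:  ~{bf16_gb:.1f} GB")   # close to the 23.8 GB in the table
print(f"UINT4: ~{uint4_gb:.1f} GB")  # raw 4-bit weights only; the reported
                                     # 6.8 GB also includes the SVD rank-32
                                     # correction matrices and quant scales
```

The gap between the raw 4-bit estimate and the reported 6.8 GB is the cost of the rank-32 SVD correction and per-group quantization metadata, which is what lets UINT4 stay close to BF16 quality.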