---
license: apache-2.0
library_name: transformers
tags:
- qwen
- vision-language
- awq
- int4
- vllm
base_model: Qwen/Qwen3-VL-8B-Instruct
---

# Qwen3-VL-8B-Instruct-AWQ

AWQ (W4A16) quantized version of `Qwen/Qwen3-VL-8B-Instruct`.

- **Quantization:** AWQ, 4-bit, group_size=128, zero_point=true, version="gemm"
- **modules_to_not_convert:** `["visual"]` (the vision tower is kept in full precision)
- Prepared with LLM Compressor oneshot AWQ.

Quantization code adapted from: https://github.com/vllm-project/llm-compressor/blob/main/examples/awq/qwen3-vl-30b-a3b-Instruct-example.py

```python
recipe = AWQModifier(
    targets="Linear",
    scheme="W4A16",
    ignore=[r"re:model.visual.*", r"re:visual.*"],  # drop lm_head from ignore
    duo_scaling=True,
)
```
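The recipe is applied with LLM Compressor's `oneshot` entry point, as in the linked example script. A minimal sketch of that flow, assuming a text-only calibration set for brevity (the linked script uses a multimodal calibration set and a custom data collator; the dataset name and sample counts below are illustrative, not a verbatim copy):

```python
from llmcompressor import oneshot
from llmcompressor.modifiers.awq import AWQModifier

MODEL_ID = "Qwen/Qwen3-VL-8B-Instruct"

# Same recipe as above: quantize Linear layers to W4A16,
# skipping the vision tower.
recipe = AWQModifier(
    targets="Linear",
    scheme="W4A16",
    ignore=[r"re:model.visual.*", r"re:visual.*"],
    duo_scaling=True,
)

# Dataset, sequence length, and sample count are illustrative assumptions.
oneshot(
    model=MODEL_ID,
    recipe=recipe,
    dataset="open_platypus",
    max_seq_length=2048,
    num_calibration_samples=256,
)
```

The quantized model is then saved with `save_pretrained(..., save_compressed=True)` so that vLLM can load the compressed weights.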
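Once quantized, the checkpoint can be served with vLLM. A minimal sketch (the model path is a placeholder for wherever this repo is downloaded or the quantized checkpoint was saved):

```shell
# Placeholder path; vLLM detects the AWQ compressed-tensors config
# from the checkpoint, so an explicit --quantization flag is optional.
vllm serve ./Qwen3-VL-8B-Instruct-AWQ
```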