patrickvonplaten committed on
Commit 1bca1cb · verified · 1 Parent(s): 7e1b52c

Update README.md

Files changed (1): README.md +7 -6
README.md CHANGED

@@ -454,7 +454,7 @@ print(assistant_message)
 
 You can also use Ministral 3 3B Instruct 2512 with `Transformers` !
 
-Transformers very recently added preliminary support for FP8, so please make sure to install from main:
+Transformers recently added support for FP8, so make sure to install from main:
 
 ```sh
 uv pip install git+https://github.com/huggingface/transformers
@@ -469,10 +469,11 @@ pip install mistral-common --upgrade
 Try it out by running the following snippet.
 
 > [!Tip]
-> By default Transformers will load the checkpoint in FP8 and dequantize it to BF16 on the fly,
-> which means the model currently does not make use of accelerated FP8-kernels.
-> Compatibility with accelerated FP8-kernels is currently worked on and will be available in a couple of weeks.
-> Stay tuned!
+> On latest main as of 05/12/2025, by default
+> a FP8 triton kernel for fast accelerated matmuls
+> (`w8a8_block_fp8_matmul_triton`) will be used
+> without any degradation in accuracy. However, if you want to
+> run your model in BF16 see (#transformers-bf16)
 
 <details>
 <summary>Python snippet</summary>
@@ -517,7 +518,7 @@ decoded_output = tokenizer.decode(output[len(tokenized["input_ids"][0]):])
 print(decoded_output)
 ```
 
-**Note:**
+#### Transformers BF16
 
 Transformers allows you to automatically convert the checkpoint to Bfloat16. To do so, simply load the model as follows:
 
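
To make the FP8-vs-BF16 distinction from the updated tip concrete, below is a minimal sketch of the two load paths with `transformers`. This is not the README's own snippet (which lies outside this diff): the repo id `mistralai/Ministral-3-3B-Instruct-2512` is an assumption based on the model name mentioned above, and the prompt is illustrative only.

```python
# Minimal sketch, assuming transformers installed from main as described in the diff.
# The repo id below is assumed from the model name in the README, not stated in this diff.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Ministral-3-3B-Instruct-2512"  # assumed repo id

# Default load: the FP8 checkpoint is used as-is; per the tip, a recent
# transformers main dispatches the matmuls to an FP8 Triton kernel.
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Alternative: convert the checkpoint to BF16 at load time, as described in
# the "Transformers BF16" section the README now links to.
model_bf16 = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Quick smoke test with the BF16 model.
tokenizer = AutoTokenizer.from_pretrained(model_id)
inputs = tokenizer("Give me a one-line fun fact.", return_tensors="pt").to(model_bf16.device)
output = model_bf16.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:]))
```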