Regarding the calculation of VRAM requirements for deploying meta-llama/Llama-4-Maverick-17B-128E-Instruct
#45 · opened by a58982284
Our company is procuring hardware to deploy meta-llama/Llama-4-Maverick-17B-128E-Instruct, and I would like to ask about the VRAM requirements: how much VRAM is needed to serve this model with vLLM at bf16 precision while supporting a 1M-token context window?