Regarding the calculation of VRAM requirements for deploying meta-llama/Llama-4-Maverick-17B-128E-Instruct

#45 · opened by a58982284

Our company is procuring hardware to deploy meta-llama/Llama-4-Maverick-17B-128E-Instruct. How much VRAM is needed if we serve it with vLLM at bf16 precision while supporting a 1M-token context window?
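For reference, here is the rough back-of-the-envelope estimate I put together. The parameter count and attention geometry below are assumptions based on public descriptions of the model (≈400B total parameters across 128 experts, grouped-query attention), so please correct them against the model's config.json:

```python
# Rough VRAM estimate for serving Llama-4-Maverick-17B-128E-Instruct
# with vLLM in bf16. All architecture numbers are ASSUMPTIONS --
# verify against the model's config.json before sizing hardware.

BYTES_PER_PARAM = 2        # bf16 = 2 bytes per value
TOTAL_PARAMS = 400e9       # assumed ~400B total params (17B active, 128 experts)

# Assumed attention geometry (check config.json):
NUM_LAYERS = 48
NUM_KV_HEADS = 8           # grouped-query attention KV heads
HEAD_DIM = 128
CONTEXT_LEN = 1_000_000    # target 1M-token context window

GIB = 1024**3

# Weights: in a MoE model every expert must be resident in VRAM,
# so the full ~400B parameters count, not just the 17B active ones.
weight_bytes = TOTAL_PARAMS * BYTES_PER_PARAM

# KV cache: 2 tensors (K and V) per layer, per token, in bf16.
kv_bytes_per_token = 2 * NUM_LAYERS * NUM_KV_HEADS * HEAD_DIM * BYTES_PER_PARAM
kv_bytes = kv_bytes_per_token * CONTEXT_LEN   # one full-length sequence

print(f"weights:   {weight_bytes / GIB:,.0f} GiB")
print(f"KV/token:  {kv_bytes_per_token / 1024:,.0f} KiB")
print(f"KV cache:  {kv_bytes / GIB:,.0f} GiB per 1M-token sequence")
print(f"total:     {(weight_bytes + kv_bytes) / GIB:,.0f} GiB (+ activations/overhead)")
```

With these assumptions the weights alone come to roughly 745 GiB and the KV cache adds about 183 GiB per 1M-token sequence, i.e. around 930 GiB before overhead. vLLM additionally reserves memory for activations and leaves headroom via `gpu_memory_utilization`, so I read this as a floor, pointing at a tensor-parallel setup across something like 8 GPUs with more than 128 GB each (or proportionally more 80 GB GPUs). Does that match what others have seen in practice?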
