Regarding the calculation of VRAM requirements for deploying meta-llama/Llama-4-Maverick-17B-128E-Instruct

#45 · opened by a58982284

Our company is procuring hardware to deploy meta-llama/Llama-4-Maverick-17B-128E-Instruct. How much VRAM is needed if we serve it with vLLM at bf16 precision while supporting a 1M-token context window?
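For reference, here is the rough back-of-the-envelope estimate I put together. The parameter count and attention geometry below are assumptions based on public descriptions of the model (≈400B total parameters across 128 experts, grouped-query attention), so please correct them against the model's config.json:

```python
# Rough VRAM estimate for serving Llama-4-Maverick-17B-128E-Instruct
# with vLLM in bf16. All architecture numbers are ASSUMPTIONS --
# verify against the model's config.json before sizing hardware.

BYTES_PER_PARAM = 2        # bf16 = 2 bytes per value
TOTAL_PARAMS = 400e9       # assumed ~400B total params (17B active, 128 experts)

# Assumed attention geometry (check config.json):
NUM_LAYERS = 48
NUM_KV_HEADS = 8           # grouped-query attention KV heads
HEAD_DIM = 128
CONTEXT_LEN = 1_000_000    # target 1M-token context window

GIB = 1024**3

# Weights: in a MoE model every expert must be resident in VRAM,
# so the full ~400B parameters count, not just the 17B active ones.
weight_bytes = TOTAL_PARAMS * BYTES_PER_PARAM

# KV cache: 2 tensors (K and V) per layer, per token, in bf16.
kv_bytes_per_token = 2 * NUM_LAYERS * NUM_KV_HEADS * HEAD_DIM * BYTES_PER_PARAM
kv_bytes = kv_bytes_per_token * CONTEXT_LEN   # one full-length sequence

print(f"weights:   {weight_bytes / GIB:,.0f} GiB")
print(f"KV/token:  {kv_bytes_per_token / 1024:,.0f} KiB")
print(f"KV cache:  {kv_bytes / GIB:,.0f} GiB per 1M-token sequence")
print(f"total:     {(weight_bytes + kv_bytes) / GIB:,.0f} GiB (+ activations/overhead)")
```

With these assumptions the weights alone come to roughly 745 GiB and the KV cache adds about 183 GiB per 1M-token sequence, i.e. around 930 GiB before overhead. vLLM additionally reserves memory for activations and leaves headroom via `gpu_memory_utilization`, so I read this as a floor, pointing at a tensor-parallel setup across something like 8 GPUs with more than 128 GB each (or proportionally more 80 GB GPUs). Does that match what others have seen in practice?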
