On a GPU with 24 GB of VRAM running a 7B FP16 model (14 GB of weights), where each user requires 800 MB of KV cache, how many concurrent users can be served before an out-of-memory (OOM) crash? | MLQuiz
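
The arithmetic behind this question can be sketched as follows. This is a minimal back-of-the-envelope estimate, not a definitive capacity model: it assumes the only memory consumers are the model weights and the per-user KV cache, ignoring activations, CUDA context, and allocator fragmentation, all of which would lower the real number. The function name `max_users` and the `mb_per_gb` parameter are hypothetical and introduced here only for illustration.

```python
def max_users(vram_gb: float, weights_gb: float, kv_mb_per_user: float,
              mb_per_gb: int = 1024) -> int:
    """Estimate how many users fit in VRAM alongside the model weights.

    Assumption: memory left after loading the weights is used only for
    KV cache, with no other overhead.
    """
    # Memory remaining once the weights are resident, in MB.
    free_mb = (vram_gb - weights_gb) * mb_per_gb
    # Only whole users fit, so round down.
    return int(free_mb // kv_mb_per_user)


# Numbers from the question: 24 GB VRAM, 14 GB weights, 800 MB per user.
print(max_users(24, 14, 800))                   # → 12 (10240 MB / 800 MB = 12.8)
print(max_users(24, 14, 800, mb_per_gb=1000))   # → 12 (10000 MB / 800 MB = 12.5)
```

Under either convention for megabytes per gigabyte, 10 GB of free memory holds 12 users; a 13th would need 10,400 MB and exceed the budget.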