vLLM large-scale serving: DeepSeek at 2.2k tok/s/H200 with wide-EP

117 points | by robertnishihara 21 hours ago

38 comments