Running GPT-OSS-120B at 500 tokens per second on Nvidia GPUs

213 points | by philipkiely 12 hours ago

140 comments