1 point | by birdculture 15 hours ago
1 comment
Title: Serve an interactive language model app with latency-optimized TensorRT-LLM