Tokasaurus: An LLM inference engine for high-throughput workloads

209 points | by rsehrlich a day ago

23 comments