Hey everyone!
I'm the dev behind this. Started as a weekend project because I kept getting sticker shock from my OpenAI bills. I'd use GPT-4 for literally everything - even "fix this typo" type requests that cost 20x more than they should.
The breakthrough was realizing most requests don't actually need the expensive models. So I built quality detection that tries the cheap model first, then upgrades only if the response is garbage.
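For anyone curious what "try cheap first, upgrade if garbage" looks like in code, here's a minimal sketch of the idea. It is not the SDK's actual API: `route`, `RoutedResponse`, the fake model functions, and `toy_score` are all hypothetical names made up for illustration, and the models are plain callables so it runs without API keys.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class RoutedResponse:
    text: str
    model: str       # "cheap" or "premium"
    escalated: bool  # True if the cheap draft was rejected


def route(
    prompt: str,
    cheap: Callable[[str], str],
    premium: Callable[[str], str],
    score: Callable[[str, str], float],
    threshold: float = 0.5,
) -> RoutedResponse:
    """Try the cheap model first; call the premium one only if the draft scores low."""
    draft = cheap(prompt)
    if score(prompt, draft) >= threshold:
        return RoutedResponse(draft, "cheap", escalated=False)
    return RoutedResponse(premium(prompt), "premium", escalated=True)


# Stand-ins so the sketch runs offline (all hypothetical, not real model calls).
def fake_cheap(prompt: str) -> str:
    # Pretend the cheap model gives up on anything that asks for a proof.
    return "" if "prove" in prompt else f"cheap answer to: {prompt}"


def fake_premium(prompt: str) -> str:
    return f"premium answer to: {prompt}"


def toy_score(prompt: str, response: str) -> float:
    # Placeholder heuristic: an empty response counts as garbage.
    return 1.0 if response.strip() else 0.0


print(route("fix this typo", fake_cheap, fake_premium, toy_score).model)
print(route("prove this theorem", fake_cheap, fake_premium, toy_score).model)
```

One thing the sketch makes obvious: an escalated request pays for both calls, so the savings depend on how rarely the cheap draft gets rejected.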
Been using it in production for 3 months now. Went from ~$400/month to ~$120/month (about a 70% cut) with zero changes to my actual prompts or code. The quality detection ends up escalating about 15-20% of requests to the premium models; the rest are handled by the cheap ones.
Works with both OpenAI and Anthropic - Claude Opus → Claude Haiku saves even more than the OpenAI routing since the price gap is bigger.
Happy to answer any questions! The trickiest part was getting the quality scoring right - accept cheap answers too readily and you get bad responses, escalate too readily and you don't save money.
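To make that tradeoff concrete, here's a rough sketch of how the escalation threshold drives cost. The scores below are invented toy numbers, not measurements, and the 1:20 price ratio is just an assumption borrowed from the "20x" figure above; the function names are hypothetical too.

```python
def escalation_rate(scores, threshold):
    """Fraction of cheap drafts scoring below the threshold (these get escalated)."""
    return sum(s < threshold for s in scores) / len(scores)


def cost_per_request(scores, threshold, cheap_cost, premium_cost):
    """Average cost: every request pays for the cheap call, escalated ones also pay premium."""
    return cheap_cost + escalation_rate(scores, threshold) * premium_cost


# Invented quality scores for ten cheap-model drafts (illustration only).
sample_scores = [0.9, 0.8, 0.2, 0.7, 0.95, 0.4, 0.85, 0.6, 0.75, 0.3]

# Assumed relative prices: premium costs 20x the cheap model.
CHEAP_COST, PREMIUM_COST = 1.0, 20.0

for threshold in (0.0, 0.5, 0.8, 1.01):
    rate = escalation_rate(sample_scores, threshold)
    cost = cost_per_request(sample_scores, threshold, CHEAP_COST, PREMIUM_COST)
    print(f"threshold={threshold:.2f}  escalated={rate:.0%}  avg cost={cost:.1f}")
```

With a threshold so strict that everything escalates, you pay for the cheap call plus the premium call on every request, which is worse than just using the premium model directly. That's the "don't save money" end of the dial.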
Also working on a team dashboard, but wanted to get the core SDK out there first since it's been so useful for me.