I spent $638 on AI coding agents in 6 weeks.

1 points | by nthypes 7 hours ago

8 comments

nthypes 6 hours ago
I've asked Sonnet 4.5 (Cursor) to generate some stats based on my cursor CSV file usage. Here it is the output:
Key Findings from Your Cursor Usage Here are the most interesting insights from your usage data:
Cost Summary
Total Cost: $928.45 over 70 days
Average per Request: $0.06
Cost per Task (Request): Ranges from $0.00 to $2.78, with 65.7% costing under $0.05
Projected Monthly Cost: ~$416 (based on average daily spend of $13.86)
Request Patterns
Requests per 5 Hours: Average 70.7, ranging from 1 to 451
Average Time Between Requests: 6 minutes 33 seconds
Median Time Between Requests: Just 13 seconds (shows bursts of activity)
Peak Activity: 1-2 PM (10.4% of all requests at 1 PM)
Busiest Day: Saturday with 21.7% of requests
Token Efficiency
Average Tokens per Request: 83,371 tokens
Median Tokens per Request: 38,342 tokens
Average Output per Request: 876 tokens
Cache Hit Rate: 88.8% (excellent! saves money)
Cost per 1,000 Tokens: $0.0009 (very efficient due to caching)
Cost per 1,000 Output Tokens: $0.14
Notable Stats
Most Expensive Request: $2.78 using 6.8M tokens (mostly cached)
Total Hours of Active Usage: 1,692 hours (~9 requests/hour)
Most Used Models: claude-4.5-sonnet-thinking, claude-3.5-sonnet, and others
Your cache hit rate of 88.8% is excellent and is saving you significant costs! Without caching, your costs would be much higher.
Rochus 6 hours ago
And was it worth it? What were you able to achieve with the LLM-support you payed for? How much time did you save?
I did experiments with Claude Sonnet and Opus and also with Gpt-5, the latter via my Perplexity subscription. My experience with Claude was mixed; most output required significant re-work which mostly consumed the time savings enabled by Claude. In contrast, Gpt-5 was able to generate code for several complex problems and made different transpilations for me, and the generated code usually compiles and correctly runs up front, all covered by my Perplexity subscription.
[-]
- nthypes 6 hours ago
  For me it's about the value-to-token ratio feeling off, as I stated in another comment.
  > What bugs me most is that many of those 340M tokens feel wasteful? Like the LLM will use 50k tokens exploring dead ends before finding a solution that could have been expressed in 5k tokens. The productivity gain is real, but it feels like I'm paying 10x more than what should be "fair" for the actual value delivered.
Woods369 7 hours ago
I totally agree with the closing questions and statements here.
Yes, there is an apparent fluctuation in pricing for said tokens/credits/etc.. and they are just finding the sweet spot.
I primarily use the https://www.warp.dev/ terminal and they have just released a new plan which outwardly states lower cost per request.
They mentioned 'At full usage, the plans didn’t scale sustainably'. This is encouraging to know.
Right now, I'm looking around and trying out. Nice read, thanks.
mnky9800n 7 hours ago
I just use claude code max (125/month flat) with no problems like this. I constantly have multiple agents doing things on clusters figuring out problems, writing codes, or otherwise doing research for me (I am a scientist). Maybe you should swich from loser Cursor to winner Claude? Haha. And anyways, you are a founder, is 638$/6weeks affordable in your budget? It is up to you to decide what you think is a reasonable expense on your adventure towards profitability.
[-]
- nthypes 7 hours ago
  Thanks for the input! I'm checking on Claude Code Max now - from what I'm seeing, even the $200/month plan has weekly rate limits (240-480 hours of Sonnet 4, 24-40 hours of Opus 4 per week).. so not quite unlimited tokens either, though definitely more predictable billing.
  $638/6 weeks won't make me broke, but here's my main issue: for me it's about the value-to-token ratio feeling off.
  What bugs me most is that many of those 340M tokens feel wasteful? Like the LLM will use 50k tokens exploring dead ends before finding a solution that could have been expressed in 5k tokens. The productivity gain is real, but it feels like I'm paying 10x more than what should be "fair" for the actual value delivered.
  Maybe this is just the current state of AI coding - the models need that exploration space to get to the answer. Or maybe I need to get better at constraining the context and being more surgical with my prompts.
  For me as a founder, it's less "can I afford this" and more "does this pricing model make sense long-term?" If AI coding becomes a $5-6k/year baseline expense per developer, that changes a lot of unit economics, especially for early-stage companies.
  Are you finding Claude Code Max more token-efficient for similar tasks, or is it just easier to stomach because the billing is flat?
  [-]
  - mnky9800n 4 hours ago
    i think when you are testing out ideas you cannot also be thinking about how efficient that is. it doesn't make a lot of sense unless the problem you are trying to solve is efficiency. So like, I get your point, but I don't think anyone is wasting tokens, the LLM explores different solutions and arrives at ones that work. You seem to not want to pay for the tokens used on bad solutions, but they were useful to find the actual solutions. I also think that there are plenty of software licenses that we pay for in my work that are multiple times 5-6k/year and yet all our software is much cheaper than the salaries that cover our developers. Good developer tools are always worth it imo.
6510 7 hours ago
Also interesting is what it will cost if you do or don't train yourself to use it. You can't be a very good spell caster if you don't practice as much as the next wizard. Also, from now on everything you do will be this expensive :P