HN Offline: Spending Too Much Money on a Coding Agent

Spending Too Much Money on a Coding Agent

GavinAnderegg | 155 points | 14day ago | allenpike.com

iamleppert|12day ago

"Now we don't need to hire a founding engineer! Yippee!" I wonder all these people who are building companies that are built on prompts (not even a person) from other companies. The minute there is a rug pull (and there WILL be one), what are you going to do? You'll be in even worse shape because in this case there won't be someone who can help you figure out your next move, there won't be an old team, there will just be NO team. Is this the future?

dotnet00|12day ago

Probably similar to the guy who was gloating on Twitter about building a service with vibe coding and without any programming knowledge around the peak of the vibe coding madness.

Only for people to start screwing around with his database and API keys because the generated code just stuck the keys into the Javascript and he didn't even have enough of a technical background to know that was something to watch out for.

IIRC he resorted to complaining about bullying and just shut it all down.

apwell23|12day ago

> around the peak of the vibe coding madness.

I thought we are currently in it now ?

RexySaxMan|12day ago

Yeah, I kind of doubt we've hit the peak yet.

dotnet00|12day ago

I don't actually hear people call it vibe coding as much as I did back in late 2024/early 2025.

Sure there are many more people building slop with AI now, but I meant the peak of "vibe coding" being parroted around everywhere.

I feel like reality is starting to sink in a little by now as the proponents of vibe coding see that all the companies telling them that programming as a career is going to be over in just a handful of years, aren't actually cutting back on hiring. Either that or my social media has decided to hide the vibe coding discourse from me.

euazOn|12day ago

The Karpathy tweet came out 2025-02-02. https://x.com/karpathy/status/1886192184808149383

dotnet00|12day ago

...my perception of time is screwed... it feels like it's been longer than that...

oc1|11day ago

all our perception of time seems messed up. claude code came out like 4 months ago and it feels like we had been using this thing for the past years. it feels like every week there is a new breakthrough in ai. it has never been more soul draining than now to be in tech just to keep up to be employable. is this what internet revolution felt like in the early 90s?

rufus_foreman|12day ago

>> back in late 2024/early 2025

As an old man, this is hilarious.

DonHopkins|12day ago

We can't bust code like we used to, but we have our ways.

One trick is to write goto statements that don't go anywhere.

So I ran a bourn shell in my emacs, which was the style at the time.

Now just to build the source code cost an hour, and in those days, timesheets had hours on them.

Take my five hours for $20, we'd say.

They didn't have blue checkmarks, so instead of tweeting, we'd just finger each other.

The important thing was that I ran a bourn shell in my emacs, which was the style at the time...

In those days, we used to call it jiggle coding.

unshavedyak|12day ago

Honestly i'm less scared of claude doing something like that, and more scared of it just bypassing difficult behavior. Ie if you chose a particularly challenging feature and it decided to give up, it'll just do things like `isAdmin(user) { /* too difficult to implement currently */ true }`. At least if it put a panic or something it would be an acceptable todo, but woof - i've had it try and bypass quite a few complex scenarios with silently failing code.

alwillis|12day ago

Sounds like a prompting/context problem, not a problem with the model.

First, use Claude's plan mode, which generates a step-by-step plan that you have to approve. One tip I've seen mentioned in videos by developers: plan mode is where you want to increase to "ultrathink" or use Opus.

Once the plan is developed, you can use Sonnet to execute the plan. If you do proper planning, you won't need to worry about Claude skipping things.

unshavedyak|11day ago

I wish there was a /model setting to use opus/ultrathink for planning, but sonnet for non planning or something.

It's a bit annoying having to swap back and forth tbh.

I also find planning to be a bit vague, where as i feel like sonnet benefits from more explicit instructions. Perhaps i should push it to reduce the scope of the plan until it's detailed enough to be sane, will give it a try

WXLCKNO|12day ago

This is by far the most crazy how thing I look out for with Claude Code in particular.

> Tries to fix some tests for a while > Fails and just .skip the test

Paradigma11|12day ago

Oh, but it will fix the test if you are not careful.

marcosscriven|12day ago

What service was this?

dotnet00|12day ago

Looks like I misremembered the shutting down bit, but it was this guy: https://twitter.com/leojr94_/status/1901560276488511759

Seems like he's still going on about being able to replicate billion dollar companies' work quickly with AI, but at least he seems a little more aware that technical understanding is still important.

ARandumGuy|12day ago

Any cost/benefit analysis of whether to use AI has to factor in the fact that AI companies aren't even close to making a profit, and are primarily funded by investment money. At some point, either the cost to operate these AI models needs to go down, or the prices will go up. And from my perspective, the latter seems a lot more likely.

immibis|11day ago

Not really. If they're running at a loss, their loss is your gain. Business is much more short-term than developers imagine it to be for some reason. You don't have to always use an infinitely sustainable strategy - you can change strategies once the more profitable unsustainable strategy stops sustaining.

v5v3|12day ago

They are not making money as they are all competing to push the models further and this R&D spending on salaries and cloud/hardware costs.

Unless models get better people are not going to pay more.

xianshou|12day ago

Rug pulls from foundation labs are one thing, and I agree with the dangers of relying on future breakthroughs, but the open-source state of the art is already pretty amazing. Given the broad availability of open-weight models within under 6 months of SotA (DeepSeek, Qwen, previously Llama) and strong open-source tooling such as Roo and Codex, why would you expect AI-driven engineering to regress to a worse state than what we have today? If every AI company vanished tomorrow, we'd still have powerful automation and years of efficiency gains left from consolidation of tools and standards, all runnable on a single MacBook.

fhd2|12day ago

The problem is the knowledge encoded in the models. It's already pretty hit and miss, hooking up a search engine (or getting human content into the context some other way, e.g. copy pasting relevant StackOverflow answers) makes all the difference.

If people stop bothering to ask and answer questions online, where will the information come from?

Logically speaking, if there's going to be a continuous need for shared Q&A (which I presume), there will be mechanisms for that. So I don't really disagree with you. It's just that having the model just isn't enough, a lot of the time. And even if this sorts itself out eventually, we might be in for some memorable times in-between two good states.

ChuckMcM|12day ago

Excellent discussion in this thread, captures a lot of the challenges. I don't think we're a peak vibe coding yet, nor have companies experienced the level of pain that is possible here.

The biggest 'rug pull' here is that the coding agent company raises there price and kills you're budget for "development."

I think a lot of MBA types would benefit from taking a long look at how they "blew up" IT and switched to IaaS / Cloud and then suddenly found their business model turned upside down when the providers decided to up their 'cut'. It's a double whammy, the subsidized IT costs to gain traction, the loss of IT jobs because of the transition, leading to to fewer and fewer IT employees, then when the switch comes there is a huge cost wall if you try to revert to the 'previous way' of doing it, even if your costs of doing it that way would today would be cheaper than the what the service provider is now charging you.

KronisLV|11day ago

> The biggest 'rug pull' here is that the coding agent company raises there price and kills you're budget for "development."

Spending a bunch of money on GPUs and running them yourself, as well as using tools that are compatible with Ollama/OpenAI type APIs feels like a safe bet.

Though having seen the GPU prices to get enough memory to run anything decent, I feel like the squeeze is already happening there at a hardware level and options like Intel Arc Pro B60 can't come soon enough!

ChuckMcM|11day ago

I don't disagree with this. When running the infrastructure for the Blekko search engine we did the math and after 115 servers worth of cluster it was always cheaper to do it ourselves than with AWS or elsewhere, than after around 1300 servers it is always cheaper to do it on your own space. (where you're paying for the facilities). It was an interesting way to reverse-engineer the colo business model :-)

KronisLV|11day ago

> "Now we don't need to hire a founding engineer! Yippee!"

This feels like a bit of a leap?

That's like saying "I just bought the JetBrains IDE Ultimate pack and some other really cool tools, so we no longer need a founding engineer!" All of that AI stuff can just be a force multiplier and most attempts at outright replacing people with them are a bit shortsighted. Closer to a temporary and somewhat inconsistent freelance worker, if anything.

That said, not wanting to pay for AI tools if they indeed help in your circumstances would also be like saying "What do you need JetBrains IDEs for, Visual Studio Code is good enough!" (and sometimes it is, so even that analogy is context dependent)

I'm reminded of rule 9 of the Joel Test: https://www.joelonsoftware.com/2000/08/09/the-joel-test-12-s...

hluska|12day ago

It get even darker - I was around in the 1990s and a lot of people who ran head on into that generation’s problems used those lessons to build huge startups in the 2000s. If we have outsourced a lot of learning, what do we do when we fail? Or how we compound on success?

pshirshov|12day ago

That's why I stick to what I can run locally. Though for most of my tasks there is no big difference between cloud models and local ones, in half the cases both produce junk but both are good enough for some mechanical transformations and as a reference book.

jbentley1|12day ago

My Claude Code usage would have been $24k last month if I didn't have a max plan, at least according to Claude-Monitor.

I've been using a tool I developed (https://github.com/stravu/crystal) to run several sessions in parallel. Sometimes I will run the same prompt multiple times and pick the winner, or sometimes I'll be working on multiple features at once, reviewing and testing one while waiting on the others.

Basically, with the right tooling you can burn tokens incredibly fast while still receiving a ton of value from them.

RobinL|12day ago

This is why unlimited plans are always revoked eventually - a small fraction of users can be responsible for huge costs (Amazon's unlimited file backup service is another good example). Also whilst in general I don't think there's much to worry about with AI energy use, burning $24k of tokens must surely be responsible for a pretty large amount of energy

grafmax|12day ago

> I don't think there's much to worry about with AI energy use

AI is a large motivating factor in data center build outs, and data centers are projected to form an increasing portion of new energy usage. An individual query may not use much but the macro effect is quite serious, especially considering the climate crisis we are already failing to manage. It’s a bit like throwing plastic out your window on the highway and ignoring the garbage patch floating in the middle of the Pacific.

octo888|12day ago

It's possible they have 1,200 users paying $20/120 paying $200 barely even using it.

spacecadet|12day ago

70,000,000 just last week ;P

But based on my costs, yours sounds much much higher :)

BiteCode_dev|12day ago

There is no way those companies don't loose ton of money on max plans.

I use and abuse mine, running multiple agents, and I know that I'd spend the entire month of fees in a few days otherwise.

So it seems like a ploy to improve their product and capture the market, like usual with startups that hope for a winner-takes-all.

And then, like uber or airbnb, the bait and switch will raise the prices eventually.

I'm wondering when the hammer will fall.

But meanwhile, let's enjoy the free buffet.

mccoyb|12day ago

Looked at your tool several times, but haven't answered this question for myself: does this tool fundamentally use the Anthropic API (not the normal MAX billing)? Presuming you built around the SDK -- haven't figured out if it is possible to use the SDK, but use the normal account billing (instead of hitting the API).

Love the idea by the way! We do need new IDE features which are centered around switching between Git worktrees and managing multiple active agents per worktree.

Edit: oh, do you invoke normal CC within your tool to avoid this issue and then post-process?

Jonovono|12day ago

Claude code has an SDK, where you specify the path to the CC executable. So I believe thats how this works. Once you have set up claude code in your environment and authed with however you like, this will just use that executable in a new UI

mccoyb|12day ago

Interesting, the docs for auth don't mention it: https://docs.anthropic.com/en/docs/claude-code/sdk#authentic...

Surprised that this works, but useful if true.

Jonovono|12day ago

https://docs.anthropic.com/en/docs/claude-code/sdk#typescrip...

`pathToClaudeCodeExecutable`!

mccoyb|12day ago

Thanks for showing!

unshavedyak|12day ago

Max $100 or $200?

I'm on $100 and i'm shocked how much usage i get out of Sonnet, while Opus feels like no usage at all. I barely even bother with Opus since most things i want to do just runout super quick.

borgel|12day ago

Interesting, I'm fairly new to using these tools and am starting with Claude Code but at the $20 level. Do you have any advice for when I would benefit from stepping up to $100? I'm not sure what gets better (besides higher usage limits).

unshavedyak|12day ago

No clue as i've not used Claude Code on Pro to get an idea of usage limits. But, if you get value out of Claude Code and ever run into limits, Max is quite generous for Sonnet imo. I have zero concern about Sonnet usage atm, so it's definitely valuable there.

Usage for Opus is my only "complaint", but i've used it so little i don't even know if it's that much better than Sonnet. As it is, even with more generous Opus limits i'd probably want a more advanced Claude Code behavior - where it uses Opus to plan and orchestrate, and Sonnet would do the grunt work for cheaper tokens. But i'm not aware of that as a feature atm.

Regardless, i'm quite pleased with Claude Code on $100 Max. If it was a bit smarter i might even upgrade to $200, but atm it's too dumb to give it more autonomy and that's what i'd need for $200. Opus might be good enough there, but $100 Opus limits are so low i've not even gotten enough experience with it to know if it's good enough for $200

vlade11115|12day ago

I recently switched from Pro to $100 Max, and the only difference I've found so far is higher usage limits. Antropic tends to give shiny new features to Max users first, but as of now, there is nothing Max-only. For me, it's a good deal nonetheless, as even $100 Max limits are huge. While on Pro, I hit the limits each day that I used Claude Code. Now I rarely see the warning, but I never actually hit the limit.

v5v3|12day ago

>My Claude Code usage would have been $24k last month if I didn't have a max plan, at least according to Claude-Monitor.

In their dreams.

qwertox|12day ago

Does Claude Max allow you to use 3rd-party tools with an API key?