Simon Willison (known for Django) has been doing a lot of LLM evangelism on his blog lately. Antirez (Redis) recently wrote a blog post with the same vibe.
I doubt they are bad programmers. They are probably better than most of us, and I doubt they feel insecure because of LLMs. Either I'm wrong, or there's something more to this.
edit: to clarify, I'm not saying Simon and Antirez are part of the hostile LLM evangelists the article criticizes. That said, the article does generalize to all LLM evangelists in at least some parts, and Simon did react to this here. For these reasons, I haven't ruled him out as at least a partial target of this article.
The author claims it's not just that they evangelize it, but that they become hostile when someone says they haven't had the same experience. I don't recall either Willison or Antirez scaring people by saying they will be left behind or that they are just afraid of becoming irrelevant. Instead they just talk about their positive experiences using it. Willison and Antirez seem fine to live and let live (maybe Antirez a bit less, but they're still not mean about it).
My gut says that is not a property of LLM evangelists, but a property of current internet culture in general. People with strong, divisive, and engaging opinions seem to do well (by some definition of well) online.
It's weird how some people seem to treat using an LLM as part of their personality, in a borderline cult-like way. So someone saying they don't use it or don't find it useful triggers an anger response in them.
That is not novel - see language/framework choice, OS (or even distro) preferences, editor wars, indentation. People develop strong opinions about tools, technology, and techniques regardless of domain. LLM maximalists just have the unfortunate capability to generate infinite content about their specific shiny thing.
This. For every absurd LLM cheerleader, there’s a corresponding LLM minimalist who trots out the “stochastic parrot” line at every possible occasion along with the fact that they do CrossFit and don’t own a TV.
I think the actual problem is that everyone tries to assert how capable or incapable coding agents currently are, but how useful they are depends heavily on what you are trying to get them to do and on your communication with the model. And often it's hard to tell whether you're just prompting it wrong or whether they're incapable of doing it.
By now we at least agree that stochastic parrots can be useful.
It would be nice if the debate were less polarized now, so we could focus on what makes them work better for some and worse for others, beyond just expectations.
And yeah, as I laid out in the article (which, of course, very few people actually read, even though it was short...), I really don't mind how people make code. It's those who try so hard to convince the rest of us that I find very suspect.
In my case I don’t even mind if these evangelists try so hard to convince other developers. What I do mind is that they seem to be quite successful in convincing our bosses. So we get things like mandatory LLM usage, minimum number of Claude API calls per day, every commit must be co-authored by Claude, etc.
You seem to be mistaking set and members here. The piece's critique is against the set (LLM Evangelists), not against specific members of the set (the ones you mentioned). One can agree with the point of the piece while still acknowledging there are good programmers who are also LLM evangelists.
You are right, I went a bit too quickly, so let me expand my chain of thought a bit.
The article is against the set of LLM evangelists who are hostile towards the skeptics.
I 100% agree with the part that basically says fuck you to them.
However, explaining the hostility as coming from a feeling of insecurity (which is plausible but would need evidence) is not fully convincing, and it seems dangerous to accept this conclusion and stop looking for the actual reasons so quickly.
And the fact that there are actually good programmers persuaded that LLMs help them weakens the "insecurity" argument quite a bit, at least as the only explanation.
As someone currently pretty much hostile to LLMs, I'm quite interested in what's currently at play but I'm suspicious of claims that initially feel good but are not strongly backed.
Like, if these hostile people were actually shills, we would want to know that, and not have closed our eyes too early because of some explanation that felt good, right? The same goes for whatever the actual reason turns out to be.
That is the basic argument for the utility of stereotypes too. I think the author is engaged in projection. I think the hecklers are the ones revealing their insecurities, which are justified. Even if AI progress were frozen today, the market conditions are going to change in ways that are hard to predict and a little bit of programming knowledge is not going to be the massive arbitrage opportunity that it used to be.
Look into the Motte-and-bailey fallacy.
That set seriously doesn't exist. Even the people on YouTube doing "vibe coding" benchmarks mostly say it's crap. (Well, they probably exist on Twitter/LinkedIn.)
This article just functions as flamebait for people who use LLMs to implement whole features to argue the semantics of "vibe coding". All while everyone is ignoring the writing on the wall. That we will soon have boxes going through billions of tokens every second. At that point slopcoding WILL be productive, but only if you build up the skill to differentiate yourself from the top 10% of prompters.
Simon just explores stuff and writes about it. Doesn't mean he uses LLMs for everything. Antirez likes to question stuff and make it better. Doesn't mean he uses LLMs for everything.
Also their experience is not my experience. I will make my own choices.
I don't think it's fair to compare against people who are already seemingly at the peak of their career, a place they got to by building skill in coding. And in fact what they have now that's valuable isn't mostly skill but capital: they've built famous software that's widely used.
Also they didn't adopt the your-career-is-ruined-if-you-don't-get-on-board tone that is sickeningly pervasive on LinkedIn. If you believe that advice and give up on being someone who understands code, you sure aren't gonna write Redis or Django.
It's a tactical error to disagree with influencers in public, much less actually criticize them, since it only arouses the mob. When there's a power difference you have to cede the platform (since they already have it), and try to ask good questions.
Questions like.. did we really even need to invoke particular influencers to discuss this issue? Why does that come up at all, and why is it the top comment? If names and argument from authority can settle issues on HN now, does it work with all credentialed authorities, or only those vocal few with certain opinions?
I suspect for truly talented people, they just like talking to LLMs. And they're also not 100% focused on programming anymore, so the async nature of it matters more than it does to people who write code full time for a living.
Hearing people on tech twitter say that LLMs always produce better code than they do by hand was pretty enlightening for me.
LLMs can produce better code for languages and domains I’m not proficient in, at a much faster rate, but damn it’s rare I look at LLM output and don’t spot something I’d do measurably better.
These things are average text generation machines. Yes you can improve the output quality by writing a good prompt that activates the right weights, getting you higher quality output. But if you’re seeing output that is consistently better than what you produce by hand, you’re probably just below average at programming. And yes, it matters sometimes. Look at the number of software bugs we’re all subjected to.
And let’s not forget that code is a liability. Utilizing code that was “cheap” to generate has a cost, which I’m sure will be the subject of much conversation in the near future.
> These things are average text generation machines.
Funny... seems like about half of devs think AI writes good code, and half think it doesn't. When you consider that it is designed to replicate average output, that makes a lot of sense.
So, as insulting as OP's idea is, it would make sense that below-average devs are getting gains by using AI, and above-average devs aren't. In theory, this situation should raise the average output quality, but only if the training corpus isn't poisoned with AI output.
I have an anecdote that doesn't mean much on its own, but supports OP's thesis: there are two former coworkers in my linkedin feed who are heavy AI evangelists, and have drifted over the years from software engineering into senior business development roles at AI startups. Both of them are unquestionably in the top 5 worst coders I have ever worked with in 15 years, one of them having been fired for code quality and testing practices. Their coding ability, transition to less technical roles, and extremely vocal support for the power of vibe coding definitely would align with OP's uncharitable character evaluation.
After a certain experience level, though, I think most of us get to the point of knowing when that difference in quality actually matters.
Some seniors love to bikeshed PRs all day because they can do it better but generally that activity has zero actual value. Sometimes it matters, often it doesn't.
Stop with the "I could do this better by hand" and ask "is it worth the extra 4 hours to do this by hand, or is this actually good enough to meet the goals?"
LLM generated code is technical debt. If you are still working on the codebase the next day it will bite you. It might be as simple as an inconvenient interface, a bunch of duplicated functions that could just be imported, but eventually you are going to have to pay it.
All code is technical debt though. We can't spend infinite hours finding the absolute minima of technical debt introduced for a change, so it is just finding the right balance. That balance is highly dependent on a huge amount of factors: how core is the system, what is the system used for, what stage of development is the system, etc.
Untested undocumented LLM code is technical debt, but if you do specs and tests it's actually the opposite, you can go beyond technical debt and regenerate your code as you like. You just need testing to be so good it guarantees the behavior you care about, and that is easier in our age of AI coding agents.
> but if you do specs and tests it's actually the opposite, you can go beyond technical debt and regenerate your code as you like.
Having to write all the specs and tests just right so you can regenerate the code until you get the desired output just sounds like an expensive version of the infinite monkey theorem, but with LLMs instead of monkeys.
A human SWE can use an LLM to refactor and reduce some of the debt just as easily too. I think, fundamentally, the possible rate of new code and new technical debt introduced by LLMs is much higher than that of a human SWE. Left unchecked, they will outpace us: a human still needs sleep, and more humans can't be added with more compute.
There's an interesting aspect to the LLM debt being taken on though in that I'm sure some are taking it on now in the bet/hopes that further advancements in LLMs will make it more easily addressable in the future before it is a real problem.
So I can tell you don't use these tools, or at least not much, because at the speed of development with them you'll be knee-deep in tech debt in a day, not a month; but as a corollary, you can have the same agentic coding tools undergo the equivalent of weeks of addressing tech debt the next day. Granted, I think this applies to greenfield, AI-first projects that work this way from the get-go and with few humans in the loop (human-to-human communication definitely becomes the rate-limiting step). But I imagine that's not the nature of your work.
I think you missed your parent post's phrase "in the specific areas _I_ work in"... LLMs are a lot better at CRUD and boilerplate than at novel hardware interfaces and a bunch of other domains.
But why would it take a month to generate significant tech debt in novel domains? It would accrue even faster then, right? The main idea I wanted to get across is that iteration speed is much faster, so what's "tech debt" on the first pass can be addressed much faster in future passes, which will happen on the order of days rather than sprints in the older paradigm. Yes, the first iterations will have a bunch of issues, but if you keep your hands on the controller you can get things to a decent state quickly. I think one of the biggest gaps I see in devs using these tools is what they do after the first pass.
Also, even for novel domains, using tools like deep research, and the ability of these tools to straight up search the internet (including public repos) during the planning phase, is a huge level up. (You should be planning first before implementing, right? You're not just opening a window and asking in a few sentences for a vaguely defined final product, I hope.)
If there are repos, papers, articles, etc. on your novel domain out there, there's a viable research -> plan -> implement -> iterate path, imo, especially once you get better at giving the tools ways to evaluate their own results, rather than going back and forth yourself for hours telling them "no, this part is wrong, no, now this part is wrong," etc.
I mean, there's also, "this looks fine but if I actually had written this code I would've naturally spent more time on it which would have led me to anticipate the future of this code just a little bit more and I will only feel that awkwardness when I come back to this code in two weeks, and then we'll do it all over again". It's a spectrum.
Now, sometimes that's 4 hours, but I've had plenty of times where I'm "racing" people using LLMs and I basically get the coding done before them. Once I debugged an issue before the robot was done `ls`-ing the codebase!
The shape of the problem is super important in considering the results here
You have the upper hand with familiarity of the code base. Any "domain expert" also necessarily has a head start knowing which parts of a bespoke complex system need adjustment when making changes.
On the other hand, a highly skilled worker who just joined the team won't have any of that tribal knowledge. There is a significant lag time getting ramped up, no matter how intelligent they are due to sheer scale (and complexity doesn't help).
A general purpose model is more like the latter than the former. It would be interesting to compare how a model fine tuned on the specific shape of your code base and problem domain performs.
I've been playing with vibe coding a lot lately, and I think in most cases the current SOTA LLMs don't produce code that I'd be satisfied with. I kind of feel like LLMs are really, really good at hacking on a messy and fragile structure, because they can "keep track of many things in their head"
BUT
An LLM can write a PNG decoder that works in whatever language I choose in one or a few shots. I can do that too, but it will take me longer than a minute!
(and I might learn something about the png format that might be useful later..)
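To give a sense of what that one-shot actually has to get right: a PNG file is an 8-byte signature followed by length/type/payload/CRC chunks. A rough sketch of just the chunk-walking part, in Python, with names of my own choosing (illustrative only, not the code the model produced):

    import struct
    import zlib

    PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

    def iter_chunks(data: bytes):
        """Yield (type, payload) for each chunk in a PNG byte string."""
        if data[:8] != PNG_SIGNATURE:
            raise ValueError("not a PNG file")
        offset = 8
        while offset < len(data):
            # Each chunk: 4-byte big-endian length, 4-byte type, payload, 4-byte CRC.
            (length,) = struct.unpack_from(">I", data, offset)
            ctype = data[offset + 4:offset + 8]
            payload = data[offset + 8:offset + 8 + length]
            (crc,) = struct.unpack_from(">I", data, offset + 8 + length)
            if zlib.crc32(ctype + payload) != crc:
                raise ValueError("bad CRC in chunk " + repr(ctype))
            yield ctype, payload
            offset += 12 + length
            if ctype == b"IEND":
                break

The real work comes after this: inflating the concatenated IDAT payloads with zlib and undoing the per-scanline filters, which is where a hand-written decoder stops being a one-minute job.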
Also, us engineers can talk about code quality all day, but does this really matter to non-engineers? Maybe objectively it does, but can we convince them that it does?
> Maybe objectively it does, but can we convince them that it does?
how long would you give our current civilisation if quality of software ceased to be important for:
- medical devices
- aircraft
- railway signalling systems
- engine management systems
- the financial system
- electrical grid
- water treatment
- and every other critical system
I kinda like Theo's take on it (https://www.youtube.com/watch?v=Z9UxjmNF7b0): there's a sliding scale of how much slop should reasonably be considered acceptable and engineers are well advised to think about it more seriously. I'm less sold on the potential benefits (since some of the examples he's given are things that I would also find easy by hand), but I agree with the general principle that having the option to do things in a super-sloppy way, combined with spending time developing intuition around having that access (and what could be accomplished that way), can produce positive feedback loops.
In short: when you produce the PNG decoder, and are satisfied with it, it's because you don't have a good reason to care about the code quality.
> Maybe objectively it does, but can we convince them that it does?
I strongly doubt it, and that's why articles like TFA project quite a bit of concern for the future. If non-engineers end up accepting results from a low-quality, not-quite-correct system, that's on them. If those results compromise credentials, corrupt databases etc., not so much.
I tried vibe coding a BMP decoder not too long ago with the rationale being “what’s simpler than BMP?”
What I got was an absolute mess that did not work at all. Perhaps this was because, in retrospect, BMP is not actually all that simple, a fact that I discovered when I did write a BMP decoder by hand. But I spent equal time vibe coding and real coding. At the end of the real coding session, I understood BMP, which I see as a benefit unto itself. This is perhaps a bit cynical but my hot take on vibe coders is that they place little value on understanding things.
Mind explaining the process you tried? As someone who’s generally not had any issue getting LLMs to sort out my side projects (ofc with my active involvement as well), I really wonder what people who report these results are trying. Did you just open a chat with claude code and try to get a single context window to one shot it?
Just out of curiosity (as someone fairly familiar with the BMP spec, and also PNG incidentally): what did you find to be the trickiest/most complex aspects?
None of this is fresh in my mind, so my recollections might be a little hazy. I think the only issue I personally had when writing a decoder was keeping the alignment of various fields right. I wrote the decoder in C# and if I remember correctly I tried to get fancy with some modern-ish deserialization code. I think I eventually resorted to writing a rather ugly but simple low-level byte reader. Nevertheless I found it to be a relatively straightforward program to write and I got most of what I wanted done in under a day.
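For illustration only (this is not my actual C# code, and it assumes the common 40-byte BITMAPINFOHEADER variant), the "ugly but simple low-level byte reader" approach boils down to fixed-offset, little-endian unpacking, roughly:

    import struct

    def read_bmp_headers(data: bytes):
        """Parse the 14-byte file header plus the 40-byte BITMAPINFOHEADER."""
        if data[:2] != b"BM":
            raise ValueError("not a BMP file")
        # File header: magic, file size, two reserved shorts, offset to pixel data.
        file_size, _res1, _res2, pixel_offset = struct.unpack_from("<IHHI", data, 2)
        # BITMAPINFOHEADER, little-endian, starting at byte 14.
        (header_size, width, height, _planes, bits_per_pixel,
         compression, _image_size, _xppm, _yppm,
         _colors_used, _colors_important) = struct.unpack_from("<IiiHHIIiiII", data, 14)
        return {
            "file_size": file_size,
            "pixel_offset": pixel_offset,
            "width": width,
            "height": height,              # negative height means top-down rows
            "bits_per_pixel": bits_per_pixel,
            "compression": compression,    # 0 = BI_RGB, i.e. uncompressed
        }

The gotchas mostly come after this point: rows padded to 4-byte boundaries, bottom-up row order, and the color table for low bit depths.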
The vibe coded version was a different story. For simplicity, I wanted to stick to an early version of BMP. I don’t remember the version off the top of my head. This was a simplified implementation for students to use and modify in a class setting. Sticking to early version BMPs also made it harder for students to go off-piste since random BMPs found on the internet probably would not work.
The main problem was that the LLM struggled to stick to a specific version of BMP. Some of those newer features (compression, color table, etc, if I recall correctly) have to be used in a coordinated way. The LLM made a real mess here, mixing and matching newer features with older ones. But I did not understand that this was the problem until I gave up and started writing things myself.
You should try a similar task with a recent model (Opus 4.5, GPT 5.2, etc) and see if there are any improvements since your last attempt. I also encourage using a coding agent. It can use test files that you provide, and perform compile-test loops to work out any mistakes.
It sounds like you used an older model, and perhaps copy-pasted code from a chat session. (Just guessing, based on what you described.)
Claude Code is way better than I am at rummaging through Git history, handling merge conflicts, renaming things, writing SQL queries whose syntax I always forget (window functions and the like). But yeah. If I give it a big, non-specific task, it generates a lot of mediocre (or worse) code.
That's funny, those are all the things I don't trust it to do. I actually use it the other way around: give it a big non-specific task, see if it works, specify better, retry, throw away 60%-90% of the generated code, fix bugs in a bunch of places, and out comes an implemented feature.
I give the agent the following standing instructions:
"Make the smallest possible change. Do not refactor existing code unless I explicitly ask."
That directive cut down considerably on the amount of extra changes I had to review. When it gets it right, the changes are close to the right size now.
The agent still tries to do too much, typically suggesting three tangents for every interaction.
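For reference, one way to make that a standing rule rather than repeating it in every prompt is a project-level instructions file (e.g. CLAUDE.md for Claude Code, or AGENTS.md for tools that read it); a minimal, hypothetical example:

    # Agent working rules (hypothetical example)
    - Make the smallest possible change that satisfies the request.
    - Do not refactor, rename, or reformat existing code unless explicitly asked.
    - Do not add new dependencies without asking first.
    - If a change would touch more than a handful of files, stop and outline a plan before editing.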
As someone who: 1.) Has a brain that is not wired to think like a computer and write code. 2.) Falls asleep at the keyboard while writing code for more than an hour or two. 3.) Has a lot of passion for sticking with an idea and making it happen, even if that means writing code and knowing the code is crap.
So, in short, LLMs write better code than I do. I'm not alone.
LLMs are not "average text generation machines" once they have context. LLMs learn a distribution.
The moment you start the prompt with "You are an interactive CLI tool that helps users with software engineering at the level of a veteran expert" you have biased the LLM such that the tokens it produces are from a very non-average part of the distribution it's modeling.
True, but nuanced. The model does not react to "you are an experienced programmer" kinds of prompts. It reacts to being given relevant information that needs to be reflected in the output.
See the examples in https://arxiv.org/abs/2305.14688; they certainly do say things like "You are a physicist specialized in atomic structure ...", but the important point is that the rest of the "expert persona" prompt _calls attention to key details_ that improve the response. The hint about electromagnetic forces in the expert persona prompt is what tipped off the model to mention it in the output.
Bringing attention to key details is what makes this work. A great tip for anyone who wants to micromanage code with an LLM is to include precise details about what they wish to micromanage: say "store it in a hash map keyed by unsigned integers" instead of letting the model decide which data structure to use.
I worked for a relatively large company (around 400 of its employees are programmers). The people who embraced LLM-generated code clearly share one trait: they are feature pushers who love to say "yes!" to management. You see, management is always right, and these programmers are always so eager to put their requirements, however incomplete, into a Copilot session and open a pull request as fast as possible.
The worst case I remember happened a few months ago, when a staff (!) engineer gave a presentation about benchmarks they had done comparing Java and Kotlin concurrency tools and how to write concurrent code. There was a very large and strange difference in performance favoring Kotlin that didn't make sense. When I dug into their code, it was clear everything had been generated by an LLM (lots of comments with emojis, for example) and the Java code was just wrong.
The competent programmers I've seen there use LLMs to generate some shell scripts, small python automations or to explore ideas. Most of the time they are unimpressed by these tools.
It'd be rather surprising if you could train an AI on a bunch of average code, and somehow get code that's always above average. Where did the improvement come from?
We should feed the output code back in to get even better code.
AI generally can improve through reinforcement learning, but this requires it to be able to compare its output to some form of metric. There aren't a lot of people I'd trust to RLHF for code quality, and anything more automated than that is destined to collapse due to Goodhart's Law.
In my view they’re great for rough drafts, iterating on ideas, throwaway code, and pushing into areas I haven’t become proficient in yet. I think in a lot of cases they write ok enough tests.
How are you judging that you'd write "better" code? More maintainable? More efficient? Does it produce bugs in the underlying code it's generating? Genuinely curious where you see the current gaps.
- it adds superfluous logic that is assumed but isn’t necessary
- as a result the code is more complex, verbose, harder to follow
- it doesn’t quite match the domain because it makes a bunch of assumptions that aren’t true in this particular domain
They’re things that can often be missed in a first pass look at the code but end up adding a lot of accidental complexity that bites you later.
When reading an unfamiliar code base we tend to assume that a certain bit of logic is there for a good reason, and that helps you understand what the system is trying to do. With generative codebases we can’t really assume that anymore unless the code has been thoroughly audited/reviewed/rewritten, at which point I find it’s easier to just write the code myself.
This has been my experience as well. But, these are things we developers care about.
Coding aside, LLMs aren't very good at following nice practices in general unless explicitly prompted to. For example, if you ask an LLM to create an error modal box from scratch, will it also implement the ability to select the text, or to hit ctrl-C to copy the text, or perhaps a copy-message button? Maybe this is a bad example, but they usually don't do things like this unless you explicitly ask them to. I don't personally care too much about this, but I think it's noteworthy in the context of lay people using LLMs to vibe code.
It's let me apply my general knowledge across domains, and do things in tech stacks or languages I don't know well. But that has also cost me hours debugging a solution I don't quite understand.
When working in my core stack though it's a nice force multiplier for routine changes.
> if you’re seeing output that is consistently better than what you produce by hand, you’re probably just below average at programming
Even though this statement does not make sense mathematically/statistically, the vast majority of SWEs are "below average." Therein lies the crux of this debate. I've been coding since the 90's and:
- LLM output is better than mine from the 90’s
- LLM output is better than mine from early 2000’s
- LLM output is worse than any of mine from 2010 onward
- LLM output (in the right hands) is better than 90% of human-written code I have seen (and I’ve seen a lot)
The most prolific coders are also more competent than average. Their prolific output is what has trained these models. These models are trained on incredibly successful projects written by masters of their fields. This is usually where I find the most pushback: the most competent SWEs see it as theft, and also as useless to them, since they have already spent years honing the skills to work relentlessly and efficiently towards solutions -- sometimes at great expense.
I'd assume most of the code visible on the web leans amateur. A huge portion of github repos seem to be from students these days. You'll see GitHub's Education page listing "5 million students" (https://github.com/education), which I assume is an under-estimate, as that's only the formal program.
> The most prolific coders are also more competent than average
This is absolutely not true lol, as anyone who's worked with a fabled 10X engineer will tell you. It's like saying the best civil engineer is the one that builds the most bridges.
I've worked with a 10x engineer and indeed they were significantly more competent than the rest of the team in their execution and design. They've seen so many problems and had a chance to discard bad patterns and adopt/try out new ones.
If firing up old coal plants and skyrocketing RAM prices and $5000 consumer GPUs and violating millions of developers' copyrights and occasionally coaxing someone into killing themselves is the cost of Brian Who Got Peter Principled Into Middle Management getting to enjoy programming again instead of blaming his kids for why he watches football and drinks all weekend instead of cultivating a hobby, I guess we have no choice but to oblige him his little treat.
> And then, inevitably, comes the character evaluation, which goes something like this:
I saw a version of this yesterday where a commenter framed LLM-skepticism as a disappointing lack of "hacker" drive and ethos that should be applied to making "AI" toolchains work.
As you might guess, I disagreed: The "hacker" is not driven just by novelty in problems to solve, but in wanting to understand them on more than a surface layer. Messing with kludgy things until they somehow work is always a part of software engineering... but the motive and payoff comes from knowing how things work, and perceiving how they could work better.
What I "fear" from LLMs-in-coding is that they will provide an unlimited flow of "mess around until it works" drudgery tasks with none of the upside. The human role will be hammering at problems which don't really have a "root cause" (except in a stochastic sense) and for which there is never any permanent or clever fix.
Would we say someone is "not really an artist" just because they don't want to spend their days reviewing generated photos for extra-fingers, circling them, and hitting the "redo" button?
We have a hard enough time finding juniors (hell, non-juniors) that know how to program and design effectively.
The industry jerking itself off over Leetcode practice already stunted the growth of many by having them focus on rote memorization and gaming interviews.
With ubiquitous AI and all of these “very smart people” pushing LLMs as an alternative to coding, I fear we’re heading into an era where people don’t understand how anything works and have never been pushed to find out.
Then again, the ability of LLMs to write boilerplate may be the reset that we need to cut out all of the people that never really had an interest in CS that have flocked to the industry over the last decade or so looking for an easy big paycheck.
> to cut out all of the people that never really had an interest in CS
I had assumed most of them had either filtered out at some stage (an early one being college intro CS classes), ended up employed somewhere that didn't seem to mind their output, or perpetually circle on LinkedIn as "Lemons" for their next prey/employer.
My gut feeling is that messy code-gen will increase their numbers rather than decrease them. LLMs make it easier to generate an illusion of constant progress, and the humans can attribute the good parts of the output to themselves, while blaming bad-parts on the LLM.
> What I "fear" from LLMs-in-coding is that they will provide an unlimited flow of "mess around until it works" drudgery tasks with none of the upside.
I feel like it's very true to the hacker spirit to spend more time customizing your text editor than actually programming, so I guess this is just the natural extension.
Even when 100% issue-oriented (that is, spending no time on editor customizations or on developing other skills and toolkits), consider the difference between:
1. This thing at work broke. Understand why it broke, and fix it in a way which stays and has preventative power. In the rare case where the cause is extremely shallow, like a typo, at least the fix is still reliable.
2. This thing at work broke. The LLM zigged when it should have zagged for no obvious reason. There is plausible-looking code that is wrong in a way that doesn't map to any human (mis-)understanding. Tweak it and hope for the best.
There’s plenty of understanding we need to get in order to learn to steer the agents precisely, rather than, as you put it, mess around until it works. Some people are actively working on it, while others make a point of looking the other way.
The anti-LLM side seems much more insecure. Pro-LLM influencers are sometimes corny, but it's sort of like any other influencer, they are incentivized to make everything sound exciting to get clicks. Nobody was complaining about 3d printer influencers raving about how printing replacement dishwasher parts was going to change everything.
LLMs have also become kind of a political issue, except only the "anti" side even really cares about it. Given that using and prompting them is very much a garbage in/garbage out scenario, people let their social and political biases cloud their usage, and instead of helping it succeed, they try to collect "gotcha" moments, which doesn't reflect the workflow of someone using an LLM productively.
I think the author slips into the same pattern he’s criticizing. He says LLM fans shouldn’t label skeptics as “afraid” then he turns around and labels the fans as “insecure” or “not very good at programming.”
It’s the same move; guessing what’s going on in someone’s head instead of sticking to what actually happened and what the tools can or can’t do.
The simpler truth is LLMs are great in some cases and painful in others. They shine on boilerplate and tests. They struggle when the domain is unusual or the requirements are fuzzy; mistakes are made, and you pay a big babysitting tax.
Instead of psychoanalyzing each other, people should share concrete examples.
Back in the day, Python devs commonly were C programmers.
Someone had to do the implementation, after all. And the C API was (and still is) kind of a big deal.
There's a reason the standard library is full of direct ports of C libraries with unsightly, highly un-Pythonic names and APIs. (Of course, it's also full of direct ports of Java libraries with unsightly, highly un-Pythonic architecture.)
This is still a thing today. There have been multiple times I oneshot some project that leadership had been waiting on some team forever to finish, and 90% of it was them refusing to touch a "noob" lang like Python or JS.
Agreed, but do you honestly think LLMs have reached the level of average programmer? Or is it more a matter of "they can churn out code until I see something that is close enough and I'll make the last few edits"?
Also curious if you publish your working setup or if it changes as fast as the LLMs? Seems like you may have a more stable setup than most given how you are developing tools in the space.
I still do not see LLMs as replacements for programmers - they're tools for programmers to direct. If you don't know anything about programming you might be able to get a vibe-coded prototype or simple tool out of them, but that's a very different thing from what happens when a skilled software developer uses these things to help accelerate their work.
My current setup is mainly Claude Code CLI on macOS and Claude Code for web, driven by the iPhone app and the macOS desktop app. I occasionally use Codex CLI too.
I expect I'll be on a different default combo of tools within a month or two.
Came here to comment on this line: it completely changes the tone of the article. It's fairly reasonable and neutral until we get here, upon which the antagonism is jarringly clear.
In fact I would posit this is the central crux of the post: OP does not believe those LLM evangelists were ever good programmers.
As others have already noted[1], many well-known excellent programmers - including yourself! and now even Linus! - would beg to differ.
Steam reached a new peak of 42 million concurrent players today [1]. An average/mid-tier gaming PC uses 0.2 kWh per hour [2]. 42 million * 0.2 gives 8,400,000 kWh per hour, or 8,400 MWh per hour.
By contrast, training GPT3 was estimated to have used 1,300 MWh of energy [3].
This does not account for training costs of newer models, nor for inference costs. But we know inference costs are extraordinarily inexpensive and energy efficient [2]. Even with the lowest estimate of energy cost, 1 hour of Steam's peak concurrent player count uses 6.5x more energy than the entire training run of GPT3.
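Spelled out, using only the figures quoted above (a quick sanity check, not new data):

    steam_peak_players = 42_000_000    # concurrent players [1]
    kwh_per_player_hour = 0.2          # mid-tier gaming PC [2]
    gpt3_training_mwh = 1_300          # estimated GPT-3 training energy [3]

    steam_mwh_per_hour = steam_peak_players * kwh_per_player_hour / 1_000
    print(steam_mwh_per_hour)                       # 8400.0 MWh per hour
    print(steam_mwh_per_hour / gpt3_training_mwh)   # ~6.46, i.e. the ~6.5x above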
I see no point in making this a numbers game. (Like, I was supposed to say "five" or something?)
Let's make it more of a category thing: when AI shows itself responsible for a new category of life-saving technique, like a cure for cancer or Alzheimer's, then I'd have to reconsider.
(And even then, it will be balanced against rising sea levels, extinctions, and other energy use effects.)
Local LLMs that you can run on consumer hardware don't really do anything though. They are amusing, maybe you could use them for basic text search, but they don't have any real knowledge like the hosted ones do.
You need to be fairly smart to be in tech. People who grew up smart and were told so tend to view it as part of their self-worth. If someone disagrees with such a person later on, their self-worth has been attacked, so of course they are going to lash out.
The worst thing you can say to a dev is they are wrong. Most will do everything in their power to prove otherwise, even on the dumbest of topics.
Humans are tribal, which has both benefits and costs.
In technology, the historical benefits of evangelizing your favorite technology might just be that it becomes more popular and better supported.
Even though LLMs may or may not follow the same path, if you can get your fellow man on-board, then you'll have a shared frame of reference, and someone to talk through different usage scenarios.
Is there enough new blood on HN? For me it was the best place, my favorite website, when I was entering the startup scene. Loved it. I don't think a lot of the young founders I know ever come here...
While this place has always been attractive to people building startups, back in the day (my original account is from 2009) "Hacker" News was much more about Hackers. Most people posting here had read "On Lisp", respected Paul Graham as a programmer and were enthusiastic about programming and solving problems above all else.
I'm honestly curious how many people that visit HN today even know what a "y combinator" is, and I have a pretty reasonable guess as to how many have implemented it for fun (though probably the applicative order version).
"LLM evangelists - are you willing to admit that you just might not be that good at programming computers?"
The people who were the best at something won't necessarily be the best at a new paradigm. Unlearning some principles and learning new ones might be a painful exercise for some masters.
Military history has shown that the masters of a new wave are not necessarily the masters of the previous one: we have seen the rise and fall of several civilizations, from the Romans to the Greeks, for being too sure of their old methods and old military equipment and strategy.
I have been through this before (wherever/whenever the money seems to flow): databases are bad, you should use Couchbase, etc. I was a DB expert; the people advocating weren't, but they were very loud. The many, many evangelistic web development alternatives that come and go, all very loud. Now the latest: LLMs. Like Couchbase et al., they have their place, but the evangelists are not having any of it.
I work a lot with doctors (writing software for them), and I am very envious of their system of specialisation, e.g. this dude is such-and-such a specialist: he knows about it, listen to him. In IT, it seems anyone who talks the loudest gets a podium, and separating the wheat from the chaff is difficult. One day we will have a system of qualifications, I hope, but it seems a long way off.
I feel no strong need to convince others. I've been seeing major productivity boosts for myself and others since Sonnet 3.5. Maybe for certain kinds of projects and use cases it's less good, maybe they're not using it well; I dunno. I do think a lot of these people probably will be left behind if they don't adopt it within the next 3 years, but that's not really my problem.
What's there to be left behind on? That's like arguing people who stick to driving cars with manual transmissions are going to get left behind when buses "inevitably get good."
The whole point of the AI coding thing is that it lets inexperienced people create software. What skill moat are you building that a skilled software developer won't be able to pick up in 20 minutes?
Everyone now is driving automatic, LLMs are the manual transmission in a classic car with "character".
Yes, anyone can step into one, start it and maybe get there, but the transmission and engine will make strange noises all the way and most people just stick to the first gear because the second gear needs this weird wiggle and a trick with the gas pedal to engage properly.
Using (agentic) LLMs as coding assistants is a skill that (at the moment) can't really be taught as it's not deterministic and based a lot on feels and getting the hang of different base models (gemini, sonnet/opus, gpt, GLM etc). The only way to really learn is by doing.
Yes, anyone can start up Google Antigravity or whatever and say "build me a todo app" and you'll get one. That's just the first gear.
To be fair to both sides, it really is hard to tell if we're in the world of
"you'll be left behind if you don't learn crypto" with crypto
or
"you'll be left behind if you don't learn how to drive" with cars
One of those statements is made in good faith, and the other is made out of insecurity. But we'll probably only really be able to tell looking backwards.
As an aside, both those statements were wrong. People who learned to drive well after cars were widely adopted were at no particular disadvantage when and if they decided to adopt the technology. You can see this is true by noting that at this point, no one alive learned to drive when cars first came out.
I don't really see the point in worrying about prompting, and agents and MCPs, and skills and all this as a skill to learn. They're trivial if you're already a developer, and if it gets good enough that you don't need to know software engineering, there's not going to be anything to learn. Setting it up is a subset of software engineering, so it will be able to do that itself once it can solve problems reliably without someone who understands what's going on to check it.
> But doing "prompt-driven development" or "vibe coding" with an Agentic LLM was an incredibly disapointing experience for me. It required an immense amount of baby sitting, for small code changes, made slowly, which were often wrong. All the while I sat there feeling dumber and dumber, as my tokens drained away.
Yeah, I find they are useful for large sweeping changes, introducing new features and stuff, mostly because they write a lot of the boilerplate, granted with some errors. But for small fiddly changes they suck; you will have a much easier time doing those changes yourself.
"I find LLMs useful as a sort of digital clerk - searching the web for me, finding documentation, looking up algorithms. I even find them useful1 in a limited coding capacity; with a small context and clear guidelines."
I am curious why the author doesn't think this saves them time (i.e. makes them more productive).
I never had terribly high output as a programmer. I certainly think LLMs have helped increase the amount of code that I can write, net total, in a year. Not to superhuman levels or even super-me levels, just me++.
But, I think the total time spent producing code has gone down to a fraction and has allowed me more time to spend thinking about what my code is meant to solve.
I wonder about two things:
1. maybe added productivity isn't going to be found in total code produced, because there is a limit on how much useful code can be produced that is based on external factors
2. do some devs look at the output of an LLM and "get the ick" because they didn't write it and LLM-code is often more verbose and "ugly", even though it may work? (this is a total supposition and not an accusation in any way. i also understand that poorly thought out, overly verbose code comes with problems over time)
The first of those is about taste, and it's real, and engineers with bad taste write unstable buggy systems.
The second of those is about priority. If all you want is functional code, any old thing will do. That's what I do for one-off scripts. But if you plan to support the code at 2am when exposed to production requests on the internet, you need to understand it, which is about legibility and coherence.
I hope you do have taste, and I hope you value more than simple "it works" tests. But it might be worth looking there for why some struggle with LLM output.
For what it's worth, I use coding agents all the time, but almost never accept their output verbatim outside of boilerplate code.
It seems you treat LoC as a measure of productivity. This would answer your question as to why the author does not find it makes them more productive. If total output increases but quality decreases (which in terms of code means more bugs), then has productivity increased or has it stayed the same?
To answer my own question: if you can pump out features faster but turn around and spend more time on bugs than you did previously, then your productivity is likely net neutral.
There is a reason LoC as a measure of productivity has been shunned from the industry for many, many years.
I didn't mean to imply LoC as a measurement of productivity. What I really mean is more "amount of useful code produced to a level the human-using-the-llm determines to be useful".
To try and give an example, say that you want to make a module that transforms some data and you ask the LLM to do it. It generates a module with tons of single-layer if-else branches with a huge LoC. Maybe one human dev looks at it and says, "great this solves my problem and the LoC and verbosity isn't an issue even though it is ugly". Maybe the second looks at it and says, "there's definitely some abstraction I can find to make this easier to understand and build on top of."
Depending on the scenario and context, either of them could be correct.
LoC is a terrible metric for comparing productivity of different developers, even before you get to Goodhart's Law.
OTOH, for a given developer to implement a given feature in a given system, at the end of the day, some amount of code has to be written.
If a particular developer finds that AI lets him write code comparable to what he would have written, in lieu of the code he would have written, but faster than he can do it alone, then looking at lines written might actually be meaningful, just in that context.
I also feel like it makes me more productive but measuring software engineering productivity is famously difficult. If there was an easy way to measure it, managers at bigco would have employed it with abandon years ago.
I feel any LLM discussion that doesn't mention concrete tooling choices is unproductive. Tooling is evolving so fast that many statements that were accurate two months ago are simply wrong today.
It's a lot different to "vibe code" by copypasting crap from a browser window to an editor vs using an agentic LLM with full access to the source and tools to search for documentation, run scripts etc.
And one major thing is language.
Some languages (Rust, React) are so complex and nuanced that LLMs struggle with them - as do humans. Agentic LLMs will eventually solve the problem you've given them but the solution might be a bit wonky.
Compare that to LLMs writing Python or Go. With Go there's just one way to write a loop, it can't get confused with that. The way to write and format the language has been exactly the same since the beginning.
Same with Python, it's pretty lenient on how you write it (objects vs functional) but there are well-estabilished standards on how to do things and it's an old language (34 years btw). Most of Python 2.x is still valid Python 3.
I think it goes further than this. Some people - some developers, even - do not _like_ programming computers. In fact, many hate it. Those people welcome the LLM agent stuff because it delivers the end product without going through the necessary pain (from their pov) of programming.
While I believe this may be true, there are also just people that get more reward from building than from the act of writing code. That doesn't mean they hate writing code, but that the building comes first. I count myself in that camp.
If I can build better/faster with reasonably equal quality, I'll trade off the joy of programming for the joy of more building, of more high level problem solving and thinking, etc.
I've also seen the opposite: those that derive more joy from the programming and the cool engineering than from the product. And you see the opposite behavior from them, of course--such as selecting a solution that's cool and novel to build, rather than the simple, boring, but better alternative.
I often find this type of engineer rather frustrating to work with, and coincidentally, they seem to be the most anti-AI type I've encountered.
Yeah, that's pretty much what I think too. I'm much more of the latter type you mention, but I think I have the enough acumen to be practical most times.
It's always been the case that engineers come in many flavors, some more and some less business-inclined. The difference with AI, imo, is that it will be (or already is) putting its trillion-dollar finger on the scale, such that there is less patience and space for people like me, and more for people like you.
> It required an immense amount of baby sitting, for small code changes, made slowly, which were often wrong.
Can’t speak for others but that’s not what I’d understand (or do) as vibecoding. If you’re babysitting it every inch of the way then yeah sure I can see how it might not be productive relative to doing it yourself.
If you’re constantly fighting the LLM because you have a very specific notion of what each line should look like it won’t be a good time.
Better to spec out some assumptions, some desired outcomes, the tech to be used, maybe the core data structure, ask the LLM what else it needs to connect the dots, add that, and then let it go.
This is also bad evangelism, but on opposite side.
Just because LLMs don't work for you outside of vibe-coding, doesn't mean it's the same for everyone.
> LLM evangelists - are you willing to admit that you just might not be that good at programming computers?
Productive usage of LLMs in large-scale projects becomes viable with excellent engineering (tests, patterns, documentation, clean code), so perhaps that question should also be asked of yourself.
It starts from the premise that the author finds LLMs are good for limited, simple tasks with small contexts and clearly defined guidelines, and specifically not good for vibe-coding.
And the author literally mentions that they aren't making universal claims about LLMs, but just speaking from personal experience.
> That doing this is its own skill and I have not spent enough time with it.
Yeah, this.
I sucked (still suck?) at it too. I spent countless hours correcting them, throwing away hours of "work" they made, and even had them nuke the workspace a couple of times (thankfully, they were siloed). I still feel like I'm wasting too much time way too often, and I'm constantly trying new things.
But I always thought I can learn and improve on this tool and its associated ecosystem as much as the other programming tools and languages and frameworks I learned over the years.
I think that coding assistants tend to be quite good as long as what you ask is close to the training data.
Anything novel and the quality falls off rapidly.
So, if you are like Antirez and ask for a linenoise improvement that has already been seen many times by the LLM at training time, the result will seem magical, but that is largely an illusion, IMO.
> You see a lot of accomplished, prominent developers claiming they are more productive without it.
Demonstrably impossible if you’re actually properly trying to use them in non-esoteric domains. I challenge anyone to very honestly showcase a non-esoteric domain in which opus4.5 does not make even the most experienced developer more productive.
How would one set this sort of test up? I surely have example domains where LLMs routinely do poorly (for example, custom bazel rules and workspaces), but what would constitute a "showcase" here?
To change my mind, I'll be satisfied with a thorough description of the domain and ideally a theory on why it does poorly in that domain. But we're not talking LLMs here, we're talking opus4.5 specifically.
A theory besides... not enough training data? Is it even possible to formulate a coherent theory about this? I'm talking about customizing a widely-used build system, not exactly state-of-the-art cryptography. What could I possibly say that you wouldn't counter with "skill issue" (which goes back to the author's point)?
If you say it's demonstrably impossible that someone can't be made more productive with opus4.5, then it should probably be up to you to demonstrate impossibility.
"I am still willing to admit I am wrong. That I'm not holding the GPS properly. That navigating with real-time satellite data is its own skill and I have not spent enough time with it. I have changed how I get around before, and I'm sure I will do so again.
Map-reading evangelists, are you willing to admit that you just might not be that good at driving a car? Maybe you once were. Maybe you never were."
I remember a similarly aggressive evangelism about self-driving cars several years ago. I suppose it's not so pleasant, when you feel like you've seen a prophetic glimpse of a brilliant future, to deal with skeptics who don't understand your vision and refuse to give your predictions the credit they deserve.
Of course we need a few people to get wildly overexcited about new possibilities, so they will go make all the early mistakes which show the rest of us what the new thing can and cannot actually do; likewise, we need most of us to feel skeptical and stick to what already works, so we don't all run off a cliff together by mistake.
I like OP's representation, but I feel like a lot of people aren't saying "LLMs are the bomb dot com _right now_" (though some are), but rather that the trend is evident: these things will keep getting better, and the writing is on the wall.
Personally I think the rate of improvement will plateau: in my experience software inevitably becomes less about tech and more about the interpersonal human soup of negotiating requirements, needs, contradictions, and feedback loops, a lot of which is not signal accessible to a text-in, text-out engine.
I just want some externally verifiable numbers. If AI is a 10x improvement, we should be seeing new operating systems. If it’s 5x we should see new game engines. If it’s 2x we should see massive amounts of new features for popular open source projects.
If it’s less than that, then it’s more like adding syntax highlighting or moving from Java to Ruby on Rails. Both of those were nice, but people weren’t breathlessly shouting about being left behind.
I don't think that will happen, at least not now. I don't even think AI would be suitable for creating an OS or a game engine. What seems to be increasing is the number of SaaS products asking for subscriptions, and most of the productivity gain, if any, is being used to add AI features to the apps.
Until some days/weeks ago, LLMs for coding were more hype than actual code production. That is gone now. They have clearly leveled up; things will not be the same anymore. And of course this is not just about coding, this is just the beginning. A month ago it really seemed that the models were hitting a complexity wall and that the architecture would need to be improved. Not anymore.
It's different this time. You can see that, at the same time, they are finally, definitively solving hard mathematical problems. They have passed the phase of being merely good search engines and become actual generators of new data from their generalizations. I can give you a simple example: any code they generated before brought frustration, and they would loop on feedback. Now they actually produce "human level" code.
This is a fun piece to dissect because it's self-aware about being uncharitable, yet still commits the very sin it's criticizing.
The author's central complaint is that LLM evangelists dismiss skeptics with psychological speculation ("you're afraid of being irrelevant"). Their response? Psychological speculation ("you're projecting insecurity about your coding skills").
This is tu quoque dressed up as insight. Fighting unfounded psychoanalysis with unfounded psychoanalysis doesn't refute anything. It just levels the playing field of bad arguments.
The author gestures at this with "I am still willing to admit I am wrong" but the bulk of the piece is vibes-based counter-psychoanalysis, not engagement with evidence.
It's a well-written "no u" that mistakes self-awareness ("I know this isn't charitable") for self-correction.
If the A.I. maximalist gospel were true, we would see a company raising a $10M Series A or seed round (these days)
spend 60% on A.I., 30% on humans and 10% on operations. But I can bet you my last penny that's not happening - so we know someone is tryna sell us a polished turd as a diamond.
At no point in history has humanity ever cut back on spending after some constraint got alleviated. Exact opposite, we always ramp spending up to chase new possibilities.
If the A.I. maximalist gospel were true we would see companies raising absurd seed and A rounds in record numbers. Which is exactly what we’re seeing.
Whatever your personal feeling, judgement, or conviction on this matter; do not dismiss the other side because of a couple wingnuts saying crazy stuff (you can find them on both extremes as well as the middle). Stay curious as to why people have their own conviction, and seek the truth!
I see it only as a threat to those who have a deep hook into their role as a SWE.
If as a SWE you see the oncoming change and adapt to it, no issue.
If as a SWE you see the enablement of LLMs as an existential threat, then you will fail to adapt and run into all kinds of problems because of it.
If a company wants to cut a lot of SWE it wouldn't matter if you have adapted or not, they will just cut as much as they can. How can you adapt more than another SWE? These tools seem easy to learn and there isn't a massive learning curve if you're a programmer. I wonder if changing to a different role would work but I would be skeptical, because after SWEs they will try to cut the other jobs and if there is no one at the bottom, middle management doesn't make sense.
I see how cool and powerful they're getting, but agree there is a huge insecurity element in the evangelism. Everyone wants to be seen as the one who will get a seat running the llms when the music stops playing.
I see a lot more insecurity from people who refuse to use AI coding tools. My teammates and I use this stuff all the time, and it's not making a statement, it's just an easier path sometimes.
I somewhat agree with this poster. However, I think the unfortunate reality of programming for money is that a mediocre programmer that pumps out millions of lines of slop that seems to drive the business forward and manages to hide disastrous bugs until after the contract / promotion cycle is over will get further ahead than the more competent programmer that delivers better, less buggy, less spaghetti code.
Most of us are paid to solve problems and deliver features, not craft the most perfect code known to man.
If the slop-o-matic next to you is delivering 5 features a week without tripping up QA and you do one every two weeks - which one will the company pick when layoffs hit again?
I mean, isn't driving the business forward really what matters (outside of academia, open source, and other such endeavors). We live in a hyper competitive market. All else being equal, if company A can produce "millions of lines of slop", constantly living on the knife-edge of disaster but not falling over it, they will beat company B that artificially slows themselves down. Up until the point company A implodes, but that's not necessarily a given if pre-LLM companies are any indication.
Huh? Where did I say that's what I like? I'm just trying to discuss for discussion's sake. Personally, I want a world that rewards the people who put their thought, care, and craftsmanship into something more than those that don't. In order to live in that world, I think we need to discuss the parts (maybe the whole) that don't and why that might be.
I don't get who is saying this dreaded "you'll be left behind." The only place I see that is from straight-up slop accounts in the Twitter algo feed. Surely you're not letting those people make you feel bad.
> You see a lot of accomplished, prominent developers claiming they are more productive without it.
You also see a lot of accomplished, prominent developers claiming they are more productive with it, so I don't know what this is supposed to prove. The inverse argument is just as easy to make and just as spurious.
I'm dubbing this "podcast driven development" because so many of them aren't building things to build things, they just want to _have built something_ so they can go on podcasts and talk about how great it is.
For what it's worth, I think most of them are genuine when they say they're seeing 10X gains; they just went from, like, a 0.01X engineer (centi-swe) to a 0.1X engineer (deci-swe).
"LLM evangelists - are you willing to admit that you just might not be that good at programming computers? Maybe you once were. Maybe you never were."
lol. is this supposed to be some sort of "gotcha"? yes? like maybe i am a really shitty programmer and always just wanted to hack things together. what it has allowed me to do is prevent burnout to some extent, outsource the "boring" parts and get back to building things i like.
also getting tired of these extreme takes but whatever, it's #1 so mission accomplished. llms are neither this nor that. just another tool in the toolbox, one that has been frustrating in some contexts and a godsend in others, and part of the process is figuring out where it excels and where it doesn't.
hmm, maybe you are not as good at using llms as you think then? lol jk.
i mean if you have imposter syndrome then this feeling will always be prevalent. how do you know what you are good at or not? i might be competent enough to have progressed this far in my career as in "results", but comparison to people i consider "good" devs always plants that doubt.
i guess it strikes a chord when someone, in the same breath as claiming to be open minded, makes a backhanded comment implying that people who like llms must just be shitty programmers or whatever. i get the point, but that line doesn't quite land the way you think it does.
> It's projection. Their evangelism is born of insecurity.
It's fear, but of a different kind. Those who are most aggressive and pushy about it are those who invested too much [someone else's] money in it and are scared that angry investors will come for their hides when reality doesn't match their expectations.
I don't mind weighing in as someone who could fairly be categorized as both an LLM evangelist and "not an experienced dev".
It's a lot like why I've been bullish on Tesla's approach to FSD even as someone who owned an AP1 vehicle that objectively was NOT "self-driving" in any sense of the word: it's less about where the technology is right now, or even how fast it is currently improving, and more about the fact that the technology now exists to accelerate its own rate of improvement, paired with the reality that we are observing exactly that. Like FSD V12 to V14, the last several years in AI can only be characterized as an unprecedented rate of improvement, much like the acceleration of scientific advancement throughout human history. It took us millions of years to evolve into humans. Hundreds of thousands to develop language. Tens of thousands to develop writing. Thousands to develop the printing press. Hundreds to develop typewriters. Decades to develop computers. Years to go from the 8086 to the modern workstations of today. The time horizon of tasks AI agents can reliably perform is now doubling every 4 months, per METR.
Do frontier models know more than human experts in all domains right now? Absolutely not. But they already know far more than any individual human expert outside that human's domain(s) of expertise.
I've been passionate about technology for nearly two decades, working in the technology industry for close to a decade. I'm a security guy, not a dev. I have over half a dozen CVEs and countless private vuln disclosures. I can and do write code myself - I've been writing scripts for various network tasks for a decade before ChatGPT ever came into existence. That said, it absolutely is a better dev than me. But specialized harnesses paired with frontier models are also better security engineers than I am, dollar for dollar versus my cost. They're better pentesters than me, for the relative costs. These statements were not true at all without accounting for cost two years ago. Two years from now, I am fully expecting them to just be outright better at security engineering, pentesting, SCA than I am, without accounting for cost, yet I also expect they will cost less then than they do now.
A year ago, OpenAI's o1 was still almost brand new, and test-time compute was this revolutionary new idea. Everyone thought you needed tens of billions to train a model as good as o1; it was still a week before DeepSeek released R1.
Now, o1's price/performance seems like a distant bad dream. I had always joked that one quarter in tech saw as much change as like 1 year in "the real world". For AI, it feels more like we're seeing more change every month than we do every year in "the real world", and I'd bet on that accelerating, too.
I don't think experienced devs still preferring to architect and write code themselves are coping at all. I still have to fix bugs in AI-generated code myself. But I do think it's short sighted to not look at the trajectory and see the writing on the wall over the next 5 years.
Stanford's $18/hr pentester that outperforms 9/10 humans should have every pentester figuring out what they're going to be doing when it doubles in performance and halves in cost again over the next year, just like human Uber drivers should be reading Motortrend's (historically a vocal critic of Tesla and FSD) 2026 Best Driver Assistance System and figuring out what they're going to do next. Experienced devs should be looking at how quickly we came from text-davinci-003 to Opus 4.5 and considering what their economic utility will look like in 2030.
Yeah, I'm being a little generous/conservative here, but also, that 2030 estimate is more along the lines of the "everyone unambiguously understands AI is better than the experts in their respective domains", not for the much sooner "it becomes more economically viable to have AI devs than human devs".
See the same thing in the bitcoin space. If you ask them to explain the value to you, you're a moronic, behind-the-times, luddite boomer who just doesn't understand. Not to mention poor!
I'll remain skeptical and let the technology speak for itself, if it ever does.
Can we just agree that both the pro- and anti-llm faction mostly contribute noise? And go back to discuss actual achievements?
It's trivial to share coding sessions, be they horrific or great. Without those, you're hot air on the internet, independent of whatever specific opinions on LLMs you voice.
LLMs are really great at copy/pasting answers from stack overflow and fitting them to work in a given system. If your work is outside what is answerable on stack overflow you're going to end up fighting the results constantly.
Front end pages like a user settings page? Done. One shottable.
Nuanced data migration problems specific to your stack? You're going to be yelling at the agent.
> LLM evangelists - are you willing to admit that you just might not be that good at programming computers? Maybe you once were. Maybe you never were.
A bit harsh considering that many of us used knowledge bases like SO for so long to figure out new problems that we were confronting.
> Front end pages like a user settings page? Done. One shottable.
This is only one-shottable if you are a fast-paced startup or you don't care enough. In real-world software, you would need to make it accessible, store data in a compliant way, hook up translations, make sure all inputs are validated, and do some usability testing.
From my heavy experience using every frontier model for a year now, LLMs are actually probably much, much better at nuanced data migration problems specific to your stack than at a frontend user settings page. (Though still pretty good at both. And the user settings page will work, sure.)
"And it made me think - why are these people so insistent, and hostile? Why can't they live and let live? Why do they need to convince the rest of us?"
Same could be said about the anti-AI crowd.
I'm glad the author made the distinction that he's talking about LLMs, though, because far too many people these days like to shout from the rooftops about all AI being bad, totally ignoring (willfully or otherwise) important areas it's being used in like cancer research.
This doesn't feel completely right.
Simon Wilson (known for Django) has been doing a lot of LLM evangelism on his blog these days. Antirez (Redis) wrote a blog post recently with the same vibe.
I doubt they are not good programmers. They are probably better than most of us, and I doubt they feel insecure because of the LLMs. Either I'm wrong, or there's something more to this.
edit: to clarify, I'm not saying Simon and Antirez are part of the hostile LLM evangelists the article criticizes. Although the article does generalize to all LLM evangelists at least in some parts and Simon did react to this here. For these reasons, I haven't ruled him out as a target of this article, at least partly.
The author claims it's not just that one evangelizes it, but that they become hostile when someone claims to not have the same experience in response. I don't recall Either Willison or Antirez scaring people by saying they will be left behind or that they are just afraid of becoming irrelevant. Instead they just talk about their positive experiences using it. Willison and Antirez seem to be fine to live and let live (maybe Antirez a bit less, but they're still not mean about it).
My gut says that is not a property of LLM evangelists, but a property of current internet culture in general. People with strong, divisive, and engaging opinions seem to do well (by some definition of well) online.
It's weird how some people seem to treat using an LLM as part of their personality in a borderline cult like way. So someone saying they don't use it or don't find it useful triggers an anger response in them.
That is not novel - see language/framework choice, OS (or even distro) preferences, editor wars, indentation. People develop strong opinions about tools, technology, and techniques regardless of domain. LLM maximalists just have the unfortunate capability to generate infinite content about their specific shiny thing.
This. For every absurd LLM cheerleader, there’s a corresponding LLM minimalist who trots out the “stochastic parrot” line at every possible occasion along with the fact that they do CrossFit and don’t own a TV.
I think the actual problem is everyone tries to assert how capable or not coding agents currently are, but how useful they are depends so much on what you are trying to get them to do and also on your communication with the model. And often its hard to tell whether you're just prompting it wrong or if they're incapable of doing it.
By now we at least agree that stochastical parrots can be useful. It would be nice if the debate now was less polarized so we could focus on what makes them work better for some and worse for others other than just expectations.
Thanks for clarifying for people.
And yeah, as I laid out in the article (that of course, very few people actually read, even though it was short...), I really don't mind how people make code. It's those that try so hard to convince the rest of us I find very suspect.
In my case I don’t even mind if these evangelists try so hard to convince other developers. What I do mind is that they seem to be quite successful in convincing our bosses. So we get things like mandatory LLM usage, minimum number of Claude API calls per day, every commit must be co-authored by Claude, etc.
That sounds horrible.
I'll convince you one day
You seem to be mistaking set and members here. The piece's critique is against the set (LLM Evangelists), not against specific members of the set (the ones you mentioned). One can agree with the point of the piece while still acknowledging there are good programmers who are also LLM evangelists.
You are right, I went a bit too quickly so let me expand a bit my chain of thoughts.
The article is against the set of LLM evangelists who are hostile towards the skeptics.
I 100% agree with the part that basically says fuck you to them.
However, explaining the hostile part by a feeling of insecurity (which is plausible but would need evidence) is not fully convincing, and it seems dangerous to accept this conclusion and stop looking for the actual reasons this quickly.
And the fact that there are actually good programmers persuaded that LLMs help them weakens the "insecurity" argument quite a bit, at least as the only explanation.
As someone currently pretty much hostile to LLMs, I'm quite interested in what's currently at play but I'm suspicious of claims that initially feel good but are not strongly backed.
Like, if these hostile people were actually shills, we would want to know that and not have closed our eyes too early because some explanation felt good, right? The same goes for whatever the actual reason turns out to be.
That is the basic argument for the utility of stereotypes too. I think the author is engaged in projection. I think the hecklers are the ones revealing their insecurities, which are justified. Even if AI progress were frozen today, the market conditions are going to change in ways that are hard to predict and a little bit of programming knowledge is not going to be the massive arbitrage opportunity that it used to be.
Look into the Motte-and-bailey fallacy. That set seriously doesn't exist. Even the people on youtube doing "vibe coding" benchmarks mostly say it's crap. (Well they probably exist on twitter/linkedin.)
This article just functions as flamebait for people who use LLMs to implement whole features to argue the semantics of "vibe coding". All while everyone is ignoring the writing on the wall. That we will soon have boxes going through billions of tokens every second. At that point slopcoding WILL be productive, but only if you build up the skill to differentiate yourself from the top 10% of prompters.
Simon just explores stuff and writes about it. Doesn't mean he uses LLMs for everything. Antirez likes to question stuff and make it better. Doesn't mean he uses LLMs for everything either.
Also their experience is not my experience. I will make my own choices.
I don't think it's fair to compare people who are already seemingly at the peak of their career, a place to which they got by building skill in coding. And in fact what they have now that's valuable isn't mostly skill but capital. They've built famous software that's widely used.
Also they didn't adopt the your-career-is-ruined-if-you-don't-get-on-board tone that is sickeningly pervasive on LinkedIn. If you believe that advice and give up on being someone who understands code, you sure aren't gonna write Redis or Django.
> They are probably better than most of us
most top engineers will have their best work locked up in their employer's private repositories
simonw and antirez have an advantage here, and at least the former is very good at self-promotion
It's a tactical error to disagree with influencers in public, much less actually criticize them, since it only arouses the mob. When there's a power difference you have to cede the platform (since they already have it), and try to ask good questions.
Questions like.. did we really even need to invoke particular influencers to discuss this issue? Why does that come up at all, and why is it the top comment? If names and argument from authority can settle issues on HN now, does it work with all credentialed authorities, or only those vocal few with certain opinions?
I suspect for truly talented people, they just like talking to LLMs. And they're also not 100% focused on programming anymore, so the async nature of it matters more than it does to people who write code full time for a living.
Hearing people on tech twitter say that LLMs always produce better code than they do by hand was pretty enlightening for me.
LLMs can produce better code for languages and domains I’m not proficient in, at a much faster rate, but damn it’s rare I look at LLM output and don’t spot something I’d do measurably better.
These things are average text generation machines. Yes you can improve the output quality by writing a good prompt that activates the right weights, getting you higher quality output. But if you’re seeing output that is consistently better than what you produce by hand, you’re probably just below average at programming. And yes, it matters sometimes. Look at the number of software bugs we’re all subjected to.
And let’s not forget that code is a liability. Utilizing code that was “cheap” to generate has a cost, which I’m sure will be the subject of much conversation in the near future.
> These things are average text generation machines.
Funny... seems like about half of devs think AI writes good code, and half think it doesn't. When you consider that it is designed to replicate average output, that makes a lot of sense.
So, as insulting as OP's idea is, it would make sense that below-average devs are getting gains by using AI, and above-average devs aren't. In theory, this situation should raise the average output quality, but only if the training corpus isn't poisoned with AI output.
I have an anecdote that doesn't mean much on its own, but supports OP's thesis: there are two former coworkers in my linkedin feed who are heavy AI evangelists, and have drifted over the years from software engineering into senior business development roles at AI startups. Both of them are unquestionably in the top 5 worst coders I have ever worked with in 15 years, one of them having been fired for code quality and testing practices. Their coding ability, transition to less technical roles, and extremely vocal support for the power of vibe coding definitely would align with OP's uncharitable character evaluation.
> it would make sense that below-average devs are getting gains by using AI
They are certainly opening more PRs. Being the gate and last safety check on the PRs is certainly driving me in the opposite direction.
After a certain experience level though, I think most of us get to the point of knowing when that difference in quality actually matters.
Some seniors love to bikeshed PRs all day because they can do it better but generally that activity has zero actual value. Sometimes it matters, often it doesn't.
Stop with the "I could do this better by hand" and ask "is it worth the extra 4 hours to do this by hand, or is this actually good enough to meet the goals?"
LLM generated code is technical debt. If you are still working on the codebase the next day it will bite you. It might be as simple as an inconvenient interface, a bunch of duplicated functions that could just be imported, but eventually you are going to have to pay it.
All code is technical debt though. We can't spend infinite hours finding the absolute minima of technical debt introduced for a change, so it is just finding the right balance. That balance is highly dependent on a huge amount of factors: how core is the system, what is the system used for, what stage of development is the system, etc.
Untested undocumented LLM code is technical debt, but if you do specs and tests it's actually the opposite, you can go beyond technical debt and regenerate your code as you like. You just need testing to be so good it guarantees the behavior you care about, and that is easier in our age of AI coding agents.
> but if you do specs and tests it's actually the opposite, you can go beyond technical debt and regenerate your code as you like.
Having to write all the specs and tests just right so you can regenerate the code until you get the desired output just sounds like an expensive version of the infinite monkey theorem, but with LLMs instead of monkeys.
You can have it write the specs and tests, too, and review and refine them much faster than you could write them.
Are people not reviewing and refactoring LLM code?
In your comment, replace “LLM” with “Human SWE” and the statement will still be correct in the vast majority of situations :)
That's legit true. All code is technical debt. Human SWEs have one saving grace. Sometimes they refactor and reduce some of the debt.
A human SWE can use an LLM to refactor and reduce some of the debt just as easily too. I think fundamentally, the possible rate of new code and new technical debt introduced by LLMs is much higher than from a human SWE. A human, even left to run unchecked, still needs sleep, and more humans can't be added with more compute.
There's an interesting aspect to the LLM debt being taken on though in that I'm sure some are taking it on now in the bet/hopes that further advancements in LLMs will make it more easily addressable in the future before it is a real problem.
"actually good enough to meet the goals?"
There's "okay for now" and then there's "this is so crap that if we set our bar this low we'll be knee deep in tech debt in a month".
A lot of LLM output in the specific areas _I_ work in is firmly in that latter category and many times just doesn't work.
So I can tell you don’t use these tools, or at least not much, because at the speed of development with them you’ll be knee deep in tech debt in a day, not a month - but as a corollary, you can have the same agentic coding tools do the equivalent of weeks of addressing tech debt the next day. Well, I think this applies to greenfield, AI-first projects that work this way from the get-go with few humans in the loop (human-to-human communication definitely becomes the rate-limiting step). But I imagine that’s not the nature of your work.
I think you missed the your parent post's phrase "in the specific areas _I_ work in" ... LLMs are a lot better at crud and boilerplate than novel hardware interfaces and a bunch of other domains.
But why would it take a month to generate significant tech debt in novel domains? It would accrue even faster there, right? The main idea I wanted to get across is that iteration speed is much faster, so what's "tech debt" in the first pass can be addressed much faster in future passes, which happen on the order of days rather than sprints in the older paradigm. Yes, the first iterations will have a bunch of issues, but if you keep your hands on the controller you can get things to a decent state quickly. I think one of the biggest gaps I see in devs using these tools is what they do after the first pass.
Also, even for novel domains, using tools like deep research, and the ability of these tools to straight-up search the internet, including public repos, during the planning phase (you should be planning first before implementing, right? You're not just opening a window and asking in a few sentences for a vaguely defined final product, I hope) is a huge level up.
If there are repos, papers, articles, etc. on your novel domain out there, there's a viable research -> plan -> implement -> iterate path imo, especially once you get better at giving the tools ways to evaluate their own results, rather than going back and forth yourself for hours telling them "no, this part is wrong, no, now this part is wrong," etc.
I mean, there's also, "this looks fine but if I actually had written this code I would've naturally spent more time on it which would have led me to anticipate the future of this code just a little bit more and I will only feel that awkwardness when I come back to this code in two weeks, and then we'll do it all over again". It's a spectrum.
Perhaps writing code by hand will be considered micro optimisation in the future.
Just like writing assembly is today.
now sometimes that's 4 hours, but I've had plenty of times where I'm "racing" people using LLMs and I basically get the coding done before them. Once I debugged an issue before the robot was done `ls`-ing the codebase!
The shape of the problem is super important in considering the results here
People usually talk about how they're better than LLMs in the domains they're experts and with known codebases.
What about all the other, large amounts of cases? Don't you ever face situations in which an LLM can greatly help (and outrace) you?
You have the upper hand with familiarity of the code base. Any "domain expert" also necessarily has a head start knowing which parts of a bespoke complex system need adjustment when making changes.
On the other hand, a highly skilled worker who just joined the team won't have any of that tribal knowledge. There is a significant lag time getting ramped up, no matter how intelligent they are due to sheer scale (and complexity doesn't help).
A general purpose model is more like the latter than the former. It would be interesting to compare how a model fine tuned on the specific shape of your code base and problem domain performs.
I've been playing with vibe coding a lot lately and I think in most cases the current SOTA LLMs don't produce code that I'd be satisfied with. I kind of feel like LLMs are really, really good at hacking on a messy and fragile structure, because they can "keep track of many things in their head"
BUT
An LLM can write a PNG decoder that works in whatever language I choose in one or a few shots. I can do that too, but it will take me longer than a minute!
(and I might learn something about the png format that might be useful later..)
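(For a sense of what there is to learn: the container itself is simple - an 8-byte signature followed by length/type/data/CRC chunks - and a rough Python sketch of walking it might look like the block below. This is my own minimal sketch, not a full decoder; the real work an LLM saves you, zlib-inflating and unfiltering the IDAT scanlines, isn't shown.)

    import struct

    PNG_SIG = b"\x89PNG\r\n\x1a\n"

    def png_chunks(path):
        # A PNG is the 8-byte signature followed by chunks:
        # 4-byte big-endian length, 4-byte type, <length> data bytes, 4-byte CRC.
        with open(path, "rb") as f:
            assert f.read(8) == PNG_SIG, "not a PNG"
            while True:
                header = f.read(8)
                if len(header) < 8:
                    break
                length, ctype = struct.unpack(">I4s", header)
                data = f.read(length)
                f.read(4)  # skip the CRC; a real decoder would verify it
                yield ctype, data
                if ctype == b"IEND":
                    break

    def png_dimensions(path):
        # Width and height are the first two big-endian uint32s of the IHDR chunk.
        for ctype, data in png_chunks(path):
            if ctype == b"IHDR":
                return struct.unpack(">II", data[:8])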
Also, us engineers can talk about code quality all day, but does this really matter to non-engineers? Maybe objectively it does, but can we convince them that it does?
> Maybe objectively it does, but can we convince them that it does?
how long would you give our current civilisation if quality of software ceased to be important for:
unless "AI" dies, we're going to find out
I kinda like Theo's take on it (https://www.youtube.com/watch?v=Z9UxjmNF7b0): there's a sliding scale of how much slop should reasonably be considered acceptable and engineers are well advised to think about it more seriously. I'm less sold on the potential benefits (since some of the examples he's given are things that I would also find easy by hand), but I agree with the general principle that having the option to do things in a super-sloppy way, combined with spending time developing intuition around having that access (and what could be accomplished that way), can produce positive feedback loops.
In short: when you produce the PNG decoder, and are satisfied with it, it's because you don't have a good reason to care about the code quality.
> Maybe objectively it does, but can we convince them that it does?
I strongly doubt it, and that's why articles like TFA project quite a bit of concern for the future. If non-engineers end up accepting results from a low-quality, not-quite-correct system, that's on them. If those results compromise credentials, corrupt databases etc., not so much.
I tried vibe coding a BMP decoder not too long ago with the rationale being “what’s simpler than BMP?”
What I got was an absolute mess that did not work at all. Perhaps this was because, in retrospect, BMP is not actually all that simple, a fact that I discovered when I did write a BMP decoder by hand. But I spent equal time vibe coding and real coding. At the end of the real coding session, I understood BMP, which I see as a benefit unto itself. This is perhaps a bit cynical but my hot take on vibe coders is that they place little value on understanding things.
Mind explaining the process you tried? As someone who’s generally not had any issue getting LLMs to sort out my side projects (ofc with my active involvement as well), I really wonder what people who report these results are trying. Did you just open a chat with claude code and try to get a single context window to one shot it?
Just out of curiosity (as someone fairly familiar with the BMP spec, and also PNG incidentally): what did you find to be the trickiest/most complex aspects?
None of this is fresh in my mind, so my recollections might be a little hazy. I think the only issue I personally had when writing a decoder was keeping the alignment of various fields right. I wrote the decoder in C# and if I remember correctly I tried to get fancy with some modern-ish deserialization code. I think I eventually resorted to writing a rather ugly but simple low-level byte reader. Nevertheless I found it to be a relatively straightforward program to write and I got most of what I wanted done in under a day.
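(To make "ugly but simple low-level byte reader" concrete, here's a minimal sketch in Python rather than the C# I actually used, covering only the classic BITMAPFILEHEADER/BITMAPINFOHEADER fields; keeping these fixed little-endian offsets and field widths straight was exactly the fiddly part.)

    import struct

    def read_bmp_header(path):
        # First 14 bytes: BITMAPFILEHEADER ('BM', file size, two reserved shorts, pixel data offset).
        # Next 40 bytes: BITMAPINFOHEADER (header size, width, height, planes, bpp, compression, ...).
        with open(path, "rb") as f:
            data = f.read(54)
        magic, file_size, _r1, _r2, pixel_offset = struct.unpack_from("<2sIHHI", data, 0)
        assert magic == b"BM", "not a BMP"
        dib_size, width, height, planes, bpp, compression = struct.unpack_from("<IiiHHI", data, 14)
        return {
            "file_size": file_size,
            "pixel_offset": pixel_offset,
            "dib_size": dib_size,
            "width": width,
            "height": abs(height),       # height can be negative for top-down bitmaps
            "bpp": bpp,
            "compression": compression,  # 0 = BI_RGB, i.e. uncompressed
        }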
The vibe coded version was a different story. For simplicity, I wanted to stick to an early version of BMP. I don’t remember the version off the top of my head. This was a simplified implementation for students to use and modify in a class setting. Sticking to early version BMPs also made it harder for students to go off-piste since random BMPs found on the internet probably would not work.
The main problem was that the LLM struggled to stick to a specific version of BMP. Some of those newer features (compression, color table, etc, if I recall correctly) have to be used in a coordinated way. The LLM made a real mess here, mixing and matching newer features with older ones. But I did not understand that this was the problem until I gave up and started writing things myself.
You should try a similar task with a recent model (Opus 4.5, GPT 5.2, etc) and see if there are any improvements since your last attempt. I also encourage using a coding agent. It can use test files that you provide, and perform compile-test loops to work out any mistakes.
It sounds like you used an older model, and perhaps copy-pasted code from a chat session. (Just guessing, based on what you described.)
Claude Code is way better than I am at rummaging through Git history, handling merge conflicts, renaming things, writing SQL queries whose syntax I always forget (window functions and the like). But yeah. If I give it a big, non-specific task, it generates a lot of mediocre (or worse) code.
That's funny that's all the things I don't trust it to do. I actually use it the other way around, give it a big non-specific task, see if it works, specify better, retry, throw away 60% - 90% of the generated code, fix bugs in a bunch of places and out comes an implemented feature.
Agreed. Claude is horrible at munging git history and can destroy the thing I depend on to fix Claude's messes. I always do my git rebasing by hand.
The first iteration of Claude code is usually a big over-coded mess, but it's pretty good at iterating to clean it up, given proper instruction.
I give the agent the following standing instructions:
"Make the smallest possible change. Do not refactor existing code unless I explicitly ask."
That directive cut down considerably on the amount of extra changes I had to review. When it gets it right, the changes are close to the right size now.
The agent still tries to do too much, typically suggesting three tangents for every interaction.
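For what it's worth, I keep that directive in a project-level instructions file so I don't have to repeat it every session (Claude Code reads a CLAUDE.md at the project root); the exact wording here, and the extra lines about tangents, are just my own additions:

    # CLAUDE.md (project root)
    - Make the smallest possible change that satisfies the request.
    - Do not refactor existing code unless I explicitly ask.
    - Do not suggest follow-up tasks or tangents unless I ask for ideas.
    - If a change would touch more than a handful of files, stop and check with me first.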
As someone who: 1.) Has a brain that is not wired to think like a computer and write code. 2.) Falls asleep at the keyboard while writing code for more than an hour or two. 3.) Has a lot of passion for sticking with an idea and making it happen, even if that means writing code and knowing the code is crap.
So, in short, LLMs write better code than I do. I'm not alone.
You are defective. But, fear not! So is the rest of humanity!
LLMs are not "average text generation machines" once they have context. LLMs learn a distribution.
The moment you start the prompt with "You are an interactive CLI tool that helps users with software engineering at the level of a veteran expert" you have biased the LLM such that the tokens it produces are from a very non-average part of the distribution it's modeling.
True, but nuanced. The model does not react to "you are an experienced programmer" kinds of prompts. It reacts to being given relevant information that needs to be reflected in the output.
See examples in https://arxiv.org/abs/2305.14688; They certainly do say things like "You are a physicist specialized in atomic structure ...", but the important point is that the rest of the "expert persona" prompt _calls attention to key details_ that improves the response. The hint about electromagnetic forces in the expert persona prompt is what tipped off the model to mention it in the output.
Bringing attention to key details is what makes this work. A great tip for anyone who wants to micromanage code with an LLM is to include precise details about what they wish to micromanage: say "store it in a hash map keyed by unsigned integers" instead of letting the model decide which data structure to use.
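A rough illustration of the difference (the wording and the scenario are mine, not from the paper):

    Vague:    "Cache the parsed records so we don't re-read the file."
    Detailed: "Cache the parsed records in a hash map keyed by the record's
               unsigned integer ID; don't bother with eviction, the file is small."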
I worked for a relatively large company (around 400 of the employees there are programmers). The people who embraced LLM-generated code clearly share one trait: they are feature pushers who love to say "yes!" to management. You see, management is always right, and these programmers are always so eager to put their requirements, however incomplete, into a Copilot session and open a pull request as fast as possible.
The worst case I remember happened a few months ago when a staff (!) engineer gave a presentation about benchmarks they had done between Java and Kotlin concurrency tools and how to write concurrent code. There was a very large and strange difference in performance favoring Kotlin that didn't make sense. When I dug into their code, it was clear everything had been generated by a LLM (lots of comments with emojis, for example) and the Java code was just wrong.
The competent programmers I've seen there use LLMs to generate some shell scripts, small python automations or to explore ideas. Most of the time they are unimpressed by these tools.
It'd be rather surprising if you could train an AI on a bunch of average code, and somehow get code that's always above average. Where did the improvement come from?
We should feed the output code back in to get even better code.
AI generally can improve through reinforcement learning, but this requires it to be able to compare its output to some form of metric. There aren't a lot of people I'd trust to RLHF for code quality, and anything more automated than that is destined to collapse due to Goodhart's Law.
In my view they’re great for rough drafts, iterating on ideas, throwaway code, and pushing into areas I haven’t become proficient in yet. I think in a lot of cases they write ok enough tests.
How are you judging that you'd write "better" code? More maintainable? More efficient? Does it produce bugs in the underlying code it's generating? Genuinely curious where you see the current gaps.
For me the biggest gaps in LLM code are:
- it adds superfluous logic that is assumed but isn’t necessary
- as a result the code is more complex, verbose, harder to follow
- it doesn’t quite match the domain because it makes a bunch of assumptions that aren’t true in this particular domain
They’re things that can often be missed in a first pass look at the code but end up adding a lot of accidental complexity that bites you later.
When reading an unfamiliar code base we tend to assume that a certain bit of logic is there for a good reason, and that helps you understand what the system is trying to do. With generative codebases we can’t really assume that anymore unless the code has been thoroughly audited/reviewed/rewritten, at which point I find it’s easier to just write the code myself.
This has been my experience as well. But, these are things we developers care about.
Coding aside, LLMs aren't very good at following nice practices in general unless explicitly prompted to. For example, if you ask an LLM to create an error modal box from scratch, will it also implement the ability to select the text, or to ctrl-C to copy the text, or perhaps a copy-message button? Maybe this is a bad example, but they usually don't do things like this unless you explicitly ask them to. I don't personally care too much about this, but I think it's noteworthy in the context of lay people using LLMs to vibe code.
I've seen a lot of examples where it fails to take advantage of previous work and rewrites functionality from scratch.
This has been my experience as well.
It's let me apply my general knowledge across domains, and do things in tech stacks or languages I don't know well. But that has also cost me hours debugging a solution I don't quite understand.
When working in my core stack though it's a nice force multiplier for routine changes.
>When working in my core stack though it's a nice force multiplier for routine changes.
what's your core stack?
> Hearing people on tech twitter say that LLMs always produce better code than they do by hand was pretty enlightening for me.
That's hilarious. LLM code is always very bad. Its only merit is that it occasionally works.
> LLMs can produce better code for languages and domains I’m not proficient in.
I am sure that's not true.
I think it says more about who's still on tech twitter vs. anything about the llm....
It seems true by construction. If you're not proficient in a language, then the bar for "better than you" is necessarily lower.
> if you’re seeing output that is consistently better than what you produce by hand, you’re probably just below average at programming
even though this statement does not mathematically / statistically make sense - the vast majority of SWEs are “below average.” Therein lies the crux of this debate. I’ve been coding since the 90’s and:
- LLM output is better than mine from the 90’s
- LLM output is better than mine from early 2000’s
- LLM output is worse than any of mine from 2010 onward
- LLM output (in the right hands) is better than 90% of human-written code I have seen (and I’ve seen a lot)
The most prolific coders are also more competent than average. Their output is what has trained these models: models trained on incredibly successful projects written by masters of their fields. This is usually where I find the most pushback - the most competent SWEs see it as theft, and also as useless to them, since they have already spent years honing skills to work relentlessly and efficiently towards solutions -- sometimes at great expense.
I'd assume most of the code visible on the web leans amateur. A huge portion of github repos seem to be from students these days. You'll see GitHub's Education page listing "5 million students" (https://github.com/education), which I assume is an under-estimate, as that's only the formal program.
> The most prolific coders are also more competent than average
This is absolutely not true lol, as anyone who's worked with a fabled 10X engineer will tell you. It's like saying the best civil engineer is the one that builds the most bridges.
The best code looks real boring.
I've worked with a 10x engineer and indeed they were significantly more competent than the rest of the team in their execution and design. They've seen so many problems and had a chance to discard bad patterns and adopt/try out new ones.
If firing up old coal plants and skyrocketing RAM prices and $5000 consumer GPUs and violating millions of developers' copyrights and occasionally coaxing someone into killing themselves is the cost of Brian Who Got Peter Principled Into Middle Management getting to enjoy programming again instead of blaming his kids for why he watches football and drinks all weekend instead of cultivating a hobby, I guess we have no choice but to oblige him his little treat.
I saw someone refer to it as future digital asbestos.
> And then, inevitably, comes the character evaluation, which goes something like this:
I saw a version of this yesterday where a commenter framed LLM-skepticism as a disappointing lack of "hacker" drive and ethos that should be applied to making "AI" toolchains work.
As you might guess, I disagreed: The "hacker" is not driven just by novelty in problems to solve, but in wanting to understand them on more than a surface layer. Messing with kludgy things until they somehow work is always a part of software engineering... but the motive and payoff comes from knowing how things work, and perceiving how they could work better.
What I "fear" from LLMs-in-coding is that they will provide an unlimited flow of "mess around until it works" drudgery tasks with none of the upside. The human role will be hammering at problems which don't really have a "root cause" (except in a stochastic sense) and for which there is never any permanent or clever fix.
Would we say someone is "not really an artist" just because they don't want to spend their days reviewing generated photos for extra-fingers, circling them, and hitting the "redo" button?
I share your fear.
We have a hard enough time finding juniors (hell, non-juniors) that know how to program and design effectively.
The industry jerking itself off over Leetcode practice already stunted the growth of many by having them focus on rote memorization and gaming interviews.
With ubiquitous AI and all of these “very smart people” pushing LLMs as an alternative to coding, I fear we’re heading into an era where people don’t understand how anything works and have never been pushed to find out.
Then again, the ability of LLMs to write boilerplate may be the reset that we need to cut out all of the people that never really had an interest in CS that have flocked to the industry over the last decade or so looking for an easy big paycheck.
> to cut out all of the people that never really had an interest in CS
I had assumed most of them had either filtered out at some stage (an early one being college intro CS classes), ended up employed somewhere that didn't seem to mind their output, or perpetually circle on LinkedIn as "Lemons" for their next prey/employer.
My gut feeling is that messy code-gen will increase their numbers rather than decrease them. LLMs make it easier to generate an illusion of constant progress, and the humans can attribute the good parts of the output to themselves, while blaming bad-parts on the LLM.
> What I "fear" from LLMs-in-coding is that they will provide an unlimited flow of "mess around until it works" drudgery tasks with none of the upside.
I feel like its very true to the hacker spirit to spend more time customizing your text editor than actually programming, so i guess this is just the natural extension.
Even when 100% issue-oriented (that is, spending no time on editor-customizatons or developing other skill and toolkits) consider the difference between:
1. This thing at work broke. Understand why it broke, and fix it in a way which stays and has preventative power. In the rare case where the cause is extremely shallow, like a typo, at least the fix is still reliable.
2. This thing at work broke. The LLM zigged when it should have zagged for no obvious reason. There is plausible-looking code that is wrong in a way that doesn't map to any human (mis-)understanding. Tweak it and hope for the best.
There’s plenty of understanding we need to get in order to learn to steer the agents precisely, rather than, as you put it, mess around until it works. Some people are actively working on it, while others make a point of looking the other way.
The anti-LLM side seems much more insecure. Pro-LLM influencers are sometimes corny, but it's sort of like any other influencer, they are incentivized to make everything sound exciting to get clicks. Nobody was complaining about 3d printer influencers raving about how printing replacement dishwasher parts was going to change everything.
LLMs have also become kind of a political issue, except only the "anti" side even really cares about it. Given that using and prompting them is very much a garbage in/garbage out scenario, people let their social and political biases cloud their usage, and instead of helping it succeed, they try to collect "gotcha" moments, which doesn't reflect the workflow of someone using an LLM productively.
I think the author slips into the same pattern he’s criticizing. He says LLM fans shouldn’t label skeptics as “afraid,” then he turns around and labels the fans as “insecure” or “not very good at programming.” It’s the same move: guessing what’s going on in someone’s head instead of sticking to what actually happened and what the tools can or can’t do. The simpler truth is LLMs are great in some cases and painful in others. They shine on boilerplate and tests. They struggle when the domain is unusual or requirements are fuzzy; mistakes get made, and you pay a big babysitting tax.
Instead of psychoanalyzing each other, people should share concrete examples
"LLM evangelists - are you willing to admit that you just might not be that good at programming computers?"
No.
I do wonder if C programmers ever asked that of Python devs back in the day.
Back in the day, Python devs commonly were C programmers.
Someone had to do the implementation, after all. And the C API was (and still is) kind of a big deal.
There's a reason the standard library is full of direct ports of C libraries with unsightly, highly un-Pythonic names and APIs. (Of course, it's also full of direct ports of Java libraries with unsightly, highly un-Pythonic architecture.)
This is still a thing today. There have been multiple times I oneshot some project that leadership had been waiting on some team forever to finish, and 90% of it was them refusing to touch a "noob" lang like Python or JS.
Any good engineer can become a good engineer in any language.
Except brainfuck and Haskell.
Still do
No, LLM evangelists will not be willing to admit this in general,
or
no, you, as an LLM evangelist, are not not willing to admit this?
The second.
Agreed, but do you honestly think LLMs have reached the level of average programmer? Or is it more a matter of "they can churn out code until I see something that is close enough and I'll make the last few edits"?
Also curious if you publish your working setup or if it changes as fast as the LLMs? Seems like you may have a more stable setup than most given how you are developing tools in the space.
I still do not see LLMs as replacements for programmers - they're tools for programmers to direct. If you don't know anything about programming you might be able to get a vibe coded prototype or simple tool out of them, but that's a very different thing from what happens when a skilled software developer uses these things to help accelerate their work.
My current setup is mainly Claude Code CLI on macOS, and Claude Code for web driven by the iPhone app and the macOS desktop app. I occasionally use Codex CLI too.
I expect I'll be on a different default combo of tools within a month or two.
Honestly? LLMs are currently above average at programming.
We've all been through The Daily WTF at least once. That's representative of the average. (Although some examples are more egregious than others.)
Came here to comment on this line: it completely changes the tone of the article. It's fairly reasonable and neutral until we get here, upon which the antagonism is jarringly clear.
In fact I would posit this is the central crux of the post: OP does not believe those LLM evangelists were ever good programmers.
As others have already noted[1], many well-known excellent programmers - including yourself! and now even Linus! - would beg to differ.
[1] https://news.ycombinator.com/item?id=46610143
Linus doesn't seem like an LLM evangelist: "Linus Torvalds is OK with vibe coding as long as it's not used for anything that matters" at https://www.theregister.com/2025/11/18/linus_torvalds_vibe_c...
You can stop questioning yourself early whether you are a good programmer just by realizing you code in python.
How much longer until we get to just... let the results speak for themselves and stop relitigating an open question with no clear answer?
We're well past ad nauseam now. Let's talk about anything else.
Given how much energy LLMs use, I'd greatly prefer not to let the results speak for themselves.
Quick napkin math time!
Steam reached a new peak of 42 million concurrent players today [1]. An average/mid-tier gaming PC uses 0.2 kWh per hour [2]. 42 million * 0.2 gives 8,400,000 kWh per hour, or 8,400 MWh per hour.
By contrast, training GPT3 was estimated to have used 1,300 MWh of energy [3].
This does not account for training costs of newer models, nor inference costs. But we know inference is extraordinarily inexpensive and energy efficient [2]. Even at the lowest estimate, one hour of Steam's peak concurrent player count uses 6.5x more energy than all of the energy that went into training GPT3.
[1]: https://www.gamespot.com/articles/steam-has-already-set-a-ne...
[2]: https://jamescunliffe.co.uk/is-gen-ai-bad-for-the-environmen...
[3]: https://www.theverge.com/24066646/ai-electricity-energy-watt...
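Spelled out as code, using only the numbers cited above:

    # Napkin math from the comment above; inputs are the cited estimates.
    players = 42_000_000            # Steam peak concurrent players [1]
    kwh_per_pc_hour = 0.2           # average/mid-tier gaming PC, kWh per hour [2]
    steam_mwh_per_hour = players * kwh_per_pc_hour / 1000
    gpt3_training_mwh = 1_300       # estimated total energy to train GPT3 [3]
    print(steam_mwh_per_hour)                       # 8400.0 MWh per hour
    print(steam_mwh_per_hour / gpt3_training_mwh)   # ~6.46, i.e. roughly 6.5x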
How many lives would AI have to save for you to say the energy cost is worth it?
I see no point in making this a numbers game. (Like, I was supposed to say "five" or something?)
Let's make it more of a category thing: when AI shows itself responsible for a new category of life-saving technique, like a cure for cancer or Alzheimer's, then I'd have to reconsider.
(And even then, it will be balanced against rising sea levels, extinctions, and other energy use effects.)
> when AI shows itself responsible for a new category of life-saving technique, like a cure for cancer or Alzheimer's, then I'd have to reconsider.
We’re way past that
Please, go into detail!
How many lives have been saved by AI? How many lives have been lost because of it?
Not what I’m asking. But idk, do you have stats? I wouldn’t require _lost_ as the ding against it; _ruined_ or _negatively impacted_ is sufficiently a problem.
Far less than you'd think for local LLMs.
Local LLMs that you can run on consumer hardware don't really do anything though. They are amusing, maybe you could use them for basic text search, but they don't have any real knowledge like the hosted ones do.
The tech industry seems to attract people that feel personally attacked when someone else makes different choices than they do.
"Why are you using Go? Rust is best! You should be using that!" "Don't use AWS CDK, use Terraform! Don't you know anything?"
It's not just the tech industry, this is a fundamental feature of humans as social animals.
https://knowyourmeme.com/videos/433740-just-coffee-black
> want to feel normal, to walk around and see that most other people made the same choice they made
You need to be fairly smart to be in tech. People who grew up smart and were told so tend to view it as part of their self-worth. If someone disagrees with such a person later on, their self-worth has been attacked, so of course they are going to lash out.
The worst thing you can say to a dev is they are wrong. Most will do everything in their power to prove otherwise, even on the dumbest of topics.
Humans are tribal, which has both benefits and costs.
In technology, the historical benefits of evangelizing your favorite technology might just be that it becomes more popular and better supported.
Even though LLMs may or may not follow the same path, if you can get your fellow man on-board, then you'll have a shared frame of reference, and someone to talk through different usage scenarios.
5 anti-AI posts on the home page of Hacker News…yeah, plenty of insecure evangelism amongst the skeptics, too.
Is there enough new blood on HN? For me it was the best place, my favorite website, when I was entering the startup scene. Loved it. I don't think a lot of the young founders I know ever come here...
> young founders
While this place has always been attractive to people building startups, back in the day (my original account is from 2009) "Hacker" News was much more about Hackers. Most people posting here had read "On Lisp", respected Paul Graham as a programmer and were enthusiastic about programming and solving problems above all else.
I'm honestly curious how many people that visit HN today even know what a "y combinator" is, and I have a pretty reasonable guess as to how many have implemented it for fun (though probably the applicative order version).
They are on LinkedIn now posting hustle bro posts about how they wake up at 4 am for yoga while Claude Code generates everything for them.
"LLM evangelists - are you willing to admit that you just might not be that good at programming computers?"
The people who were the best at something won't necessarily be the best at a new paradigm. Unlearning some principles and learning new ones can be a painful exercise for some masters.
Military history has shown that the masters of the new wave are not necessarily the masters of the previous wave; we have seen the rise and downfall of several civilizations, from the Romans to the Greeks, for being too sure of their old methods, old military equipment, and strategy.
I have been through this before (wherever/whenever the money seems to flow): databases are bad, you should use Couchbase, etc. I was a DB expert; the people advocating weren't, but they were very loud. Then the many, many evangelistic web development alternatives that come and go, all very loud. Now the latest: LLMs. Like Couchbase et al they have their place, but the evangelists are not having any of it.
I work a lot with doctors (writing software for them) and I am very envious of their system of specialisation, e.g. this dude is such-and-such a specialist - he knows about it, listen to him. In IT it seems anyone who talks the loudest gets a podium, and separating the wheat from the chaff is difficult. One day we will have a system of qualifications, I hope, but it seems a long way off.
I feel no strong need to convince others. I've been seeing major productivity boosts for myself and others since Sonnet 3.5. Maybe for certain kinds of projects and use cases it's less good, maybe they're not using it well; I dunno. I do think a lot of these people probably will be left behind if they don't adopt it within the next 3 years, but that's not really my problem.
What's there to be left behind on? That's like arguing people who stick to driving cars with manual transmissions are going to get left behind when buses "inevitably get good."
The whole point of the AI coding thing is that it lets inexperienced people create software. What skill moat are you building that a skilled software developer won't be able to pick up in 20 minutes?
Your analogy is the wrong way around :)
Everyone now is driving automatic, LLMs are the manual transmission in a classic car with "character".
Yes, anyone can step into one, start it and maybe get there, but the transmission and engine will make strange noises all the way and most people just stick to the first gear because the second gear needs this weird wiggle and a trick with the gas pedal to engage properly.
Using (agentic) LLMs as coding assistants is a skill that (at the moment) can't really be taught as it's not deterministic and based a lot on feels and getting the hang of different base models (gemini, sonnet/opus, gpt, GLM etc). The only way to really learn is by doing.
Yes, anyone can start up Google Antigravity or whatever and say "build me a todo app" and you'll get one. That's just the first gear.
> "...these people probably will be left behind if they don't adopt it..."
And there it is, the insecure evangelism.
To be fair to both sides, it really is hard to tell if we're in the world of
"you'll be left behind if you don't learn crypto" with crypto
or
"you'll be left behind if you don't learn how to drive" with cars
One of those statements is made in good faith, and the other is made out of insecurity. But we'll probably only really be able to tell looking backwards.
As an aside, both those statements were wrong. People who learned to drive well after cars were widely adopted were at no particular disadvantage when and if they decided to adopt the technology. You can see this is true by noting that at this point, no one alive learned to drive when cars first came out.
I don't really see the point in worrying about prompting, agents, MCPs, skills, and all the rest as a skill to learn. They're trivial if you're already a developer, and if it gets good enough that you don't need to know software engineering, there won't be anything to learn. Setting it up is a subset of software engineering, so it will be able to do that itself once it can solve problems reliably without someone who understands what's going on to check it.
> But doing "prompt-driven development" or "vibe coding" with an Agentic LLM was an incredibly disappointing experience for me. It required an immense amount of baby sitting, for small code changes, made slowly, which were often wrong. All the while I sat there feeling dumber and dumber, as my tokens drained away.
Yeah, I find they are useful for large sweeping changes, introducing new features and such, mostly because they write a lot of the boilerplate, granted with some errors. But for small fiddly changes they suck; you will have a much easier time doing those changes yourself.
"I find LLMs useful as a sort of digital clerk - searching the web for me, finding documentation, looking up algorithms. I even find them useful1 in a limited coding capacity; with a small context and clear guidelines."
I am curious why the author doesn't think this saves them time (i.e. makes them more productive).
I never had terribly high output as a programmer. I certainly think LLMs have helped increase the amount of code that I can write, net total, in a year. Not to superhuman levels or even super-me levels, just me++.
But, I think the total time spent producing code has gone down to a fraction and has allowed me more time to spend thinking about what my code is meant to solve.
I wonder about two things:
1. Maybe added productivity isn't going to be found in total code produced, because there is a limit on how much useful code can be produced, and that limit is set by external factors.
2. Do some devs look at the output of an LLM and "get the ick" because they didn't write it, and LLM code is often more verbose and "ugly", even though it may work? (This is a total supposition and not an accusation in any way. I also understand that poorly thought out, overly verbose code comes with problems over time.)
> "get the ick"
> even though it may work?
The first of those is about taste, and it's real, and engineers with bad taste write unstable buggy systems.
The second of those is about priority. If all you want is functional code, any old thing will do. That's what I do for one-off scripts. But if you plan to support the code at 2am when exposed to production requests on the internet, you need to understand it, which is about legibility and coherence.
I hope you do have taste, and I hope you value more than simple "it works" tests. But it might be worth looking there for why some struggle with LLM output.
For what it's worth, I use coding agents all the time, but almost never accept their output verbatim outside of boilerplate code.
It seems you're treating LoC as a measure of productivity. That would answer your question as to why the author doesn't find it makes them more productive. If total output increases but quality decreases (which in terms of code means more bugs), has productivity increased or has it stayed the same?
To answer my own question: if you can pump out features faster but turn around and spend more time on bugs than you did previously, then your productivity is likely net neutral.
There is a reason LoC as a measure of productivity has been shunned from the industry for many, many years.
I didn't mean to imply LoC as a measurement of productivity. What I really mean is more "amount of useful code produced to a level the human-using-the-llm determines to be useful".
To try and give an example, say that you want to make a module that transforms some data and you ask the LLM to do it. It generates a module with tons of single-layer if-else branches with a huge LoC. Maybe one human dev looks at it and says, "great this solves my problem and the LoC and verbosity isn't an issue even though it is ugly". Maybe the second looks at it and says, "there's definitely some abstraction I can find to make this easier to understand and build on top of."
Depending on the scenario and context, either of them could be correct.
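To make that concrete, here's a rough sketch of the kind of thing I mean (the function and names are made up, not from any real codebase): both versions do the same transform, and either could be the right call depending on context.

    package main

    import "fmt"

    // Hypothetical example: a tiny status-normalizing transform, written two ways.

    // Flat if/else chain: the verbose shape an LLM will often hand you.
    func normalizeStatusVerbose(s string) string {
        if s == "ok" {
            return "OK"
        } else if s == "okay" {
            return "OK"
        } else if s == "fail" {
            return "FAILED"
        } else if s == "failed" {
            return "FAILED"
        }
        return "UNKNOWN"
    }

    // Table-driven version: the abstraction the second developer might reach for.
    var statusTable = map[string]string{
        "ok": "OK", "okay": "OK",
        "fail": "FAILED", "failed": "FAILED",
    }

    func normalizeStatusTable(s string) string {
        if out, ok := statusTable[s]; ok {
            return out
        }
        return "UNKNOWN"
    }

    func main() {
        fmt.Println(normalizeStatusVerbose("okay")) // OK
        fmt.Println(normalizeStatusTable("okay"))   // OK
    }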
LoC is a terrible metric for comparing productivity of different developers, even before you get to Goodhart's Law.
OTOH, for a given developer to implement a given feature in a given system, at the end of the day, some amount of code has to be written.
If a particular developer finds that AI lets him write code comparable to what he would have written, in lieu of the code he would have written, but faster than he can do it alone, then looking at lines written might actually be meaningful, just in that context.
I feel like it makes me more productive, but I am not even sure it does even with my light usage. How do we even measure it?
I also feel like it makes me more productive but measuring software engineering productivity is famously difficult. If there was an easy way to measure it, managers at bigco would have employed it with abandon years ago.
I feel any LLM discussion that doesn't mention concrete tooling choices is unproductive. Tooling is evolving so fast that many statements that were accurate two months ago are simply wrong today.
It's a lot different to "vibe code" by copypasting crap from a browser window to an editor vs using an agentic LLM with full access to the source and tools to search for documentation, run scripts etc.
And one major thing is language.
Some languages and frameworks (Rust, React) are so complex and nuanced that LLMs struggle with them, as do humans. Agentic LLMs will eventually solve the problem you've given them, but the solution might be a bit wonky.
Compare that to LLMs writing Python or Go. With Go there's just one way to write a loop, it can't get confused with that. The way to write and format the language has been exactly the same since the beginning.
Same with Python: it's pretty lenient on how you write it (object-oriented vs functional), but there are well-established standards for how to do things and it's an old language (34 years btw). Most of Python 2.x is still valid Python 3.
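For what it's worth, a rough illustration of the "one way to write a loop" point; every loop below is the same for keyword in a different shape:

    package main

    import "fmt"

    // Toy example: Go's only loop construct is for.
    func main() {
        nums := []int{1, 2, 3}

        // Classic counter loop.
        for i := 0; i < len(nums); i++ {
            fmt.Println(nums[i])
        }

        // "While"-style loop: still for, just with a condition.
        n := 3
        for n > 0 {
            n--
        }

        // Range loop over a slice: for again.
        for _, v := range nums {
            fmt.Println(v)
        }
    }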
I think it goes further than this. Some people - some developers, even - do not _like_ programming computers. In fact, many hate it. Those people welcome the LLM agent stuff because it delivers the end product without going through the necessary pain (from their pov) of programming.
While I believe this may be true, there are also just people that get more reward from building than from the act of writing code. That doesn't mean they hate writing code, but that the building comes first. I count myself in that camp.
If I can build better/faster with reasonably equal quality, I'll trade off the joy of programming for the joy of more building, of more high level problem solving and thinking, etc.
I've also seen the opposite: those that derive more joy from the programming and the cool engineering than from the product. And you see the opposite behavior from them, of course--such as selecting a solution that's cool and novel to build, rather than the simple, boring, but better alternative.
I often find this type of engineer rather frustrating to work with, and coincidentally, they seem to be the most anti-AI type I've encountered.
Yeah, that's pretty much what I think too. I'm much more of the latter type you mention, but I think I have enough acumen to be practical most of the time.
It's always been the case that engineers come in many flavors, some more and some less business-inclined. The difference with AI, imo, is that it will put (or already is putting) its trillion-dollar finger on the scale, such that there is less patience and space for people like me, and more for people like you.
I, for one, love programming (and passionately hate AI).
Note I also think AI is bad for philosophical/ethical reasons.
> It required an immense amount of baby sitting, for small code changes, made slowly, which were often wrong.
Can’t speak for others but that’s not what I’d understand (or do) as vibecoding. If you’re babysitting it every inch of the way then yeah sure I can see how it might not be productive relative to doing it yourself.
If you’re constantly fighting the LLM because you have a very specific notion of what each line should look like it won’t be a good time.
Better to spec out some assumptions, some desired outcomes, tech to be used, maybe the core data structure, ask the llm what else it needs to connect the dots, add that and then let it go
This is also bad evangelism, but on opposite side.
Just because LLMs don't work for you outside of vibe-coding, doesn't mean it's the same for everyone.
> LLM evangelists - are you willing to admit that you just might not be that good at programming computers?
Productive usage of LLMs in large-scale projects becomes viable with excellent engineering (tests, patterns, documentation, clean code), so perhaps that question should also be asked of yourself.
I think you should read the article again, because this comment is a straw man vis-a-vis the article.
Is it?
The article starts from the premise that LLMs are only good for vibe-coding.
No it doesn't.
It starts from the premise that the author finds LLMs are good for limited, simple tasks with small contexts and clearly defined guidelines, and specifically not good for vibe-coding.
And the author literally mentions that they aren't making universal claims about LLMs, but just speaking from personal experience.
You're offering a very generous interpretation, to the point of extrapolating beyond what's written. Allow me to give an example:
> I genuinely don't mind if other people vibe code. Go for it!
> But that is not enough for the vocal proponents. It's the future!
The author is okay with others voicing their positive opinion about LLMs only as long as it's limited to vibe coding.
That defines a gatekeeping threshold for what level of positive opinion it is acceptable for others to hold, according to the author.
Nothing in the text you quoted implies anything of the sort, and you're moving the goalposts.
Good day.
> That doing this is its own skill and I have not spent enough time with it.
Yeah, this.
I sucked (still suck?) at it too. I spent countless hours correcting them, threw away hours of "work" they made, and even had them nuke the workspace a couple of times (thankfully, they were siloed). I still feel like I'm wasting too much time way too often, and I'm constantly trying new things.
But I always thought I can learn and improve on this tool and its associated ecosystem as much as the other programming tools and languages and frameworks I learned over the years.
I tend to share the sentiment of the author.
I think that coding assistants tend to be quite good as long as what you ask is close to the training data.
Anything novel and the quality falls off rapidly.
So, if you are like Antirez and ask for a linenoise improvement that has already been seen many times by the LLM at training time, the result will seem magical, but that is largely an illusion, IMO.
But how many times a week are you truly doing something novel? A thing that nobody in the world (or in the LLM training data) has done before?
> You see a lot of accomplished, prominent developers claiming they are more productive without it.
Demonstrably impossible if you’re actually properly trying to use them in non-esoteric domains. I challenge anyone to very honestly showcase a non-esoteric domain in which opus4.5 does not make even the most experienced developer more productive.
How would one set this sort of test up? I surely have example domains where LLMs routinely do poorly (for example, custom bazel rules and workspaces), but what would constitute a "showcase" here?
To change my mind I’ll be satisfied with a thorough description of the domain and ideally a theory on why it does poorly in that domain. But we’re not talking LLMs here, we’re talking opus4.5 specifically.
A theory besides... not enough training data? Is it even possible to formulate a coherent theory about this? I'm talking about customizing a widely-used build system, not exactly state-of-the-art cryptography. What could I possibly say that you wouldn't counter with "skill issue" (which goes back to the author's point)?
If you say it's demonstrably impossible that someone can't be made more productive with opus4.5, then it should probably be up to you to demonstrate impossibility.
How could it possibly be a skill issue? Have you tried in earnest to use opus4.5 for the problem you’re trying to solve?
Not enough training data couldn’t be the problem - Bazel is not an esoteric domain. Unless you’re trying to do something esoteric.
"I am still willing to admit I am wrong. That I'm not holding the GPS properly. That navigating with real-time satellite data is its own skill and I have not spent enough time with it. I have changed how I get around before, and I'm sure I will do so again.
Map-reading evangelists, are you willing to admit that you just might not be that good at driving a car? Maybe you once were. Maybe you never were."
I remember a similarly aggressive evangelism about self-driving cars several years ago. I suppose it's not so pleasant, when you feel like you've seen a prophetic glimpse of a brilliant future, to deal with skeptics who don't understand your vision and refuse to give your predictions the credit they deserve.
Of course we need a few people to get wildly overexcited about new possibilities, so they will go make all the early mistakes which show the rest of us what the new thing can and cannot actually do; likewise, we need most of us to feel skeptical and stick to what already works, so we don't all run off a cliff together by mistake.
I like OP's representation, but I feel like a lot of people aren't saying 'LLMs are the bomb dot com _right now_' (though some are), but rather that the trend is evident: these things will keep getting better, and the writing is on the wall.
Personally I think the rate of improvement will plateau: in my experience software inevitably becomes less about tech and more about the interpersonal human soup of negotiating requirements, needs, contradictions, and feedback loops, a lot of which is not signal accessible to a text-in, text-out engine.
It "becomes"? In a lot of areas, particularly enterprise, business stuff, it had been mostly about all of these things for decades.
"becomes" as in the locus of focus as my career progresses. Vibes-wise, i think we agree / are saying the same thing.
I just want some externally verifiable numbers. If AI is a 10x improvement, we should be seeing new operating systems. If it’s 5x we should see new game engines. If it’s 2x we should see massive amounts of new features for popular open source projects.
If it’s less than that, then it’s more like adding syntax highlighting or moving from Java to Ruby on Rails. Both of those were nice, but people weren’t breathlessly shouting about being left behind.
I don't think that will happen, at least not now. I don't even think AI would be suitable for creating an OS or a game engine. What seems to be increasing is the number of SaaS products asking for subscriptions, and most of the productivity gain, if any, is being used to add AI features to the apps.
Until a few days or weeks ago, LLMs for coding were more hype than actual code production. That is gone now. They have clearly leveled up; things will not be the same anymore. And of course this is not just for coding; this is just the beginning. A month ago it really seemed that the models were hitting a complexity wall and that the architecture would need to be improved. Not anymore.
I have seen people say something along these lines what feels like every month for the past year.
It's different this time. You can see that at the same time, they are finally, definitely solving hard mathematical problems. They passed the phase of just being like good search engines to being actual generators of new data from their generalizations. I can give you a simple example. Any code they generated before brought frustration and they would loop with feedback. Now they actually produce "human level" code.
> It's different this time.
they said this every time too
And they’re right every single time. It’s only getting crazier and crazier. Look around. Pay attention.
This is a fun piece to dissect because it's self-aware about being uncharitable, yet still commits the very sin it's criticizing.
The author's central complaint is that LLM evangelists dismiss skeptics with psychological speculation ("you're afraid of being irrelevant"). Their response? Psychological speculation ("you're projecting insecurity about your coding skills").
This is tu quoque dressed up as insight. Fighting unfounded psychoanalysis with unfounded psychoanalysis doesn't refute anything. It just levels the playing field of bad arguments.
The author gestures at this with "I am still willing to admit I am wrong" but the bulk of the piece is vibes-based counter-psychoanalysis, not engagement with evidence.
It's a well-written "no u" that mistakes self-awareness ("I know this isn't charitable") for self-correction.
Whatever standard of code quality you want, if it hasn't reached it yet, it will get there very soon.
If the AI-maximalist gospel were true, we would see a company raising a $10M Series A or seed round (these days) spend 60% on AI, 30% on humans, and 10% on operations. I can bet you my sole penny that's not happening, so we know someone is tryna sell us a polished turd as a diamond.
At no point in history has humanity ever cut back on spending after some constraint got alleviated. Exact opposite, we always ramp spending up to chase new possibilities.
If A.I maximalism gospel was true we would see companies raising absurd seed and A rounds in record numbers. Which is exactly what we’re seeing
Whatever your personal feeling, judgement, or conviction on this matter; do not dismiss the other side because of a couple wingnuts saying crazy stuff (you can find them on both extremes as well as the middle). Stay curious as to why people have their own conviction, and seek the truth!
What I really like about LLMs is that you can do pair programming without having to deal with humans.
I see it only as a threat to those who have a deep hook into their role as a SWE.
If as a SWE you see the oncoming change and adapt to it, no issue.
If as a SWE you see the enablement of LLMs as an existential threat, then you will find many issues, and you will fail to adapt and have all kind of issues related to it.
If a company wants to cut a lot of SWEs, it wouldn't matter whether you have adapted or not; they will just cut as much as they can. How can you adapt more than another SWE? These tools seem easy to learn, and there isn't a massive learning curve if you're a programmer. I wonder if changing to a different role would work, but I'm skeptical, because after SWEs they will try to cut the other jobs, and if there's no one at the bottom, middle management doesn't make sense.
Read the article. The author wants LLMs to work for them, but they don’t. You’re confusing results and intention.
I see how cool and powerful they're getting, but agree there is a huge insecurity element in the evangelism. Everyone wants to be seen as the one who will get a seat running the llms when the music stops playing.
> LLM evangelists - are you willing to admit that you just might not be that good at programming computers? Maybe you once were. Maybe you never were.
Or maybe the author is bad at programming AND bad at agentic coding.
That’s more likely than the possibility that all llm evangelists are terrible coders.
I see a lot more insecurity from people who refuse to use AI coding tools. My teammates and I use this stuff all the time, and it's not making a statement; it's just an easier path sometimes.
I somewhat agree with this poster. However, I think the unfortunate reality of programming for money is that a mediocre programmer that pumps out millions of lines of slop that seems to drive the business forward and manages to hide disastrous bugs until after the contract / promotion cycle is over will get further ahead than the more competent programmer that delivers better, less buggy, less spaghetti code.
Most of us are paid to solve problems and deliver features, not craft the most perfect code known to man.
If the slop-o-matic next to you is delivering 5 features a week without tripping up QA and you do one every two weeks - which one will the company pick when layoffs hit again?
I mean, isn't driving the business forward really what matters (outside of academia, open source, and other such endeavors)? We live in a hyper-competitive market. All else being equal, if company A can produce "millions of lines of slop", constantly living on the knife-edge of disaster but not falling over it, they will beat company B that artificially slows itself down. Up until the point company A implodes, but that's not necessarily a given if pre-LLM companies are any indication.
Sounds like you should go bundle sub-prime mortgages into some complex securities, if you like intentionally living on the knife's edge of disaster.
Huh? Where did I say that's what I like? I'm just trying to discuss for discussion's sake. Personally, I want a world that rewards the people who put their thought, care, and craftsmanship into something more than those that don't. In order to live in that world, I think we need to discuss the parts (maybe the whole) that don't and why that might be.
don't bother. Your parent commenter is writing some loaded comments in this post.
This is not reality for most companies. Some have billions in the bank but still produce slop. It's because their internal systems reward slop.
I don't get who is saying this dreaded "you'll be left behind." The only place I see that is from straight-up slop accounts in the Twitter algo feed. Surely you're not letting those people make you feel bad.
> You see a lot of accomplished, prominent developers claiming they are more productive without it.
You also see a lot of accomplished, prominent developers claiming they are more productive with it, so I don't know what this is supposed to prove. The inverse argument is just as easy to make and just as spurious.
I'm dubbing this "podcast driven development" because so many of them aren't building things to build things, they just want to _have built something_ so they can go on podcasts and talk about how great it is.
For what it's worth, I think most of them are genuine when they say they're seeing 10X gains; they just went from, like, a 0.01X engineer (centi-SWE) to a 0.1X engineer (deci-SWE).
ITT: A bunch of people who think they're god's gift to Earth.
> You tried agentic coding. You realised it was better at programming than you are.
We need to drop this competition paradigm ASAP.
Evangelism of a new technique and tool for doing work is insecure? I agree it’s been oversold but it’s natural to be pretty excited about the tech.
"LLM evangelists - are you willing to admit that you just might not be that good at programming computers? Maybe you once were. Maybe you never were."
lol. is this supposed to be some sort of "gotcha"? yes? like maybe i am a really shitty programmer and always just wanted to hack things together. what it has allowed me to do is prevent burnout to some extent, outsource the "boring" parts, and get back to building things i like.
also getting tired of these extreme takes but whatever, it's #1 so mission accomplished. llms are neither this nor that. just another tool in the toolbox, one that has been frustrating in some contexts and a godsend in others, and part of the process is figuring out where it excels and where it doesn't.
Not really.
I use LLMs for things I am not good at. But I also know I am not good at them.
No one is good at everything. 100% fine.
hmm, maybe you are not as good at using llms as you think then? lol jk.
i mean if you have imposter syndrome then this feeling will always be prevalent. how do you know what you are good at or not? i might be competent enough to have progressed this far in my career in terms of "results", but comparison to people i consider "good" devs always plants that doubt.
i guess it strikes a chord when someone, in the same breath as claiming to be open-minded, makes a backhanded comment that people who like llms might just be shitty programmers or whatever. i get the point, but that line doesn't quite land the way you think it does.
> It's projection. Their evangelism is born of insecurity.
It's fear, but of a different kind. Those who are most aggressive and pushy about it are those who invested too much [of someone else's] money in it and are scared that angry investors will come for their hides when reality doesn't match their expectations.
I don't mind weighing in as someone who could fairly be categorized as both an LLM evangelist and "not an experienced dev".
It's a lot like why I've been bullish on Tesla's approach to FSD even as someone who owned an AP1 vehicle that objectively was NOT "self-driving" in any sense of the word: it's less about where the technology is right now, or even the speed at which it is currently improving, and more about how the technology is now positioned to enable an accelerating rate of improvement, paired with the reality of us observing exactly that. Like FSD V12 to V14, the last several years in AI can only be characterized as an unprecedented rate of improvement, very much like scientific advancement throughout human society. It took us millions of years to evolve into humans. Hundreds of thousands to develop language. Tens of thousands to develop writing. Thousands to develop the printing press. Hundreds to develop typewriters. Decades to develop computers. Years to go from the 8086 to the modern workstations of today. The time horizon of tasks AI agents can reliably perform is now doubling every 4 months, per METR.
Do frontier models know more than human experts in all domains right now? Absolutely not. But they already know far more than any individual human expert outside that human's domain(s) of expertise.
I've been passionate about technology for nearly two decades, working in the technology industry for close to a decade. I'm a security guy, not a dev. I have over half a dozen CVEs and countless private vuln disclosures. I can and do write code myself - I've been writing scripts for various network tasks for a decade before ChatGPT ever came into existence. That said, it absolutely is a better dev than me. But specialized harnesses paired with frontier models are also better security engineers than I am, dollar for dollar versus my cost. They're better pentesters than me, for the relative costs. These statements were not true at all without accounting for cost two years ago. Two years from now, I am fully expecting them to just be outright better at security engineering, pentesting, SCA than I am, without accounting for cost, yet I also expect they will cost less then than they do now.
A year ago, OpenAI's o1 was still almost brand new, test-time compute was this revolutionary new idea. Everyone thought you needed tens of billions to train a model as good as o1, it was still a week before Deepseek released R1.
Now, o1's price/performance seems like a distant bad dream. I had always joked that one quarter in tech saw as much change as like 1 year in "the real world". For AI, it feels more like we're seeing more change every month than we do every year in "the real world", and I'd bet on that accelerating, too.
I don't think experienced devs still preferring to architect and write code themselves are coping at all. I still have to fix bugs in AI-generated code myself. But I do think it's short sighted to not look at the trajectory and see the writing on the wall over the next 5 years.
Stanford's $18/hr pentester that outperforms 9/10 humans should have every pentester figuring out what they're going to be doing when it doubles in performance and halves in cost again over the next year, just like human Uber drivers should be reading Motortrend's (historically a vocal critic of Tesla and FSD) 2026 Best Driver Assistance System and figuring out what they're going to do next. Experienced devs should be looking at how quickly we came from text-davinci-003 to Opus 4.5 and considering what their economic utility will look like in 2030.
5 years? That seems generous. We are being threatened with this summer (in some companies it's gonna be even earlier).
Yeah, I'm being a little generous/conservative here, but also, that 2030 estimate is more along the lines of the "everyone unambiguously understands AI is better than the experts in their respective domains", not for the much sooner "it becomes more economically viable to have AI devs than human devs".
See the same thing in the bitcoin space. If you ask them to explain the value to you, you're a moronic, behind-the-times, luddite boomer who just doesn't understand. Not to mention poor!
I'll remain skeptical and let the technology speak for itself, if it ever does.
mitchellh talked recently about how he vibe coded the one-off visualization code for some blog post of his, and he seems like a fairly good programmer.
For good craftspeople bad tools are still tools. For bad craftspeople tools, good ones and bad ones, are just a way to produce more crap.
Can we just agree that both the pro- and anti-LLM factions mostly contribute noise, and go back to discussing actual achievements?
It's trivial to share coding sessions, be they horrific or great. Without those, you're hot air on the internet, independent of whatever specific opinions on LLMs you voice.
Damn even reading that title shows how dumb i am !!
LLMs are really great at copy/pasting answers from stack overflow and fitting them to work in a given system. If your work is outside what is answerable on stack overflow you're going to end up fighting the results constantly.
Front end pages like a user settings page? Done. One shottable.
Nuanced data migration problems specific to your stack? You're going to be yelling at the agent.
> LLM evangelists - are you willing to admit that you just might not be that good at programming computers? Maybe you once were. Maybe you never were.
A bit harsh considering that many of us used knowledge bases like SO for so long to figure out new problems that we were confronting.
> Front end pages like a user settings page? Done. One shottable.
This is only one-shottable if you are a high-paced startup or you don't care enough. In real-world software you would need to make it accessible, store data in a compliant way, hook up translations, make sure all inputs are validated, and do some usability testing.
From my heavy experience using every frontier model for a year now, LLMs are actually probably much, much better at nuanced data migration problems specific to your stack than at a frontend user settings page. (Though still pretty good at both. And the user settings page will work, sure.)
Your experience that "AI coding is bad" will match your belief that "AI coding is bad".
Maybe get an LLM to summarise the article for you?
"And it made me think - why are these people so insistent, and hostile? Why can't they live and let live? Why do they need to convince the rest of us?"
Same could be said about the anti-AI crowd.
I'm glad the author made the distinction that he's talking about LLMs, though, because far too many people these days like to shout from the rooftops about all AI being bad, totally ignoring (willfully or otherwise) important areas it's being used in like cancer research.