Am I spending too much time on HN or is every post/comment section filled with this same narrative? Basically, LLMs are exciting but they produce messy code for which the dev feels no ownership. Managing a codebase written by an LLM is difficult because you have not cognitively loaded the entire thing into your head as you do with code written yourself. They're okay for one-off scripts or projects you do not intend to maintain.
This is the blog post/comment section summary I encounter many times per day.
The other side of it is people who seem to have 'gotten it' and can dispatch multiple agents to plan/execute/merge changes across a project and want to tell you how awesome their workflow is without actually showing any code.
I think you described it much more succinctly than most people do. It's been my exact experience as well. The LLM can develop much faster than I can build a mental model. It's very easy to get to a point where you don't know what's going on, a bunch of bugs have been introduced and you can't easily fix them or refactor because you're essentially the new guy on your own project. I find myself adjusting by committing code very frequently and periodically asking the LLM to explain it to me. I often ask the LLM to confirm things are working the way it says they are and it tends to find its own bugs that way.
I use an LLM primarily for smaller, focused data analysis tasks, so it's possible to move fast and still stay reasonably on top of things if I'm even a little bit careful. I think it would be really easy to trash a large code base in a hurry without some discipline and skill in using LLMs. I'm finding that developing prompts, managing context, controlling pace, staying organized and being able to effectively review the LLM's work are required skills for LLM-assisted coding. Nobody teaches this stuff yet, so you have to learn it the hard way.
Now that I have a taste, I wouldn't give it up. There's so much tedious stuff I just don't want to have to do myself that I can offload to the LLM. After more than 20 years doing this, I don't have the same level of patience anymore. There are also situations where I know conceptually what I want to accomplish but may not know exactly how to implement it and I love the LLM for that. I can definitely accomplish more in less time than I ever did before.
I think you hit the nail on the head with the mental model part. I really like this way of thinking about programming: "Programming as Theory Building" https://gist.github.com/onlurking/fc5c81d18cfce9ff81bc968a7f...
I don't mind when other programmers use AI, and I use it myself. What I mind is the abdication of responsibility for the code or result. I don't think we should be issuing a disclaimer when we use AI any more than when I use grep to do a log search. If we use it, we own the result of it as a tool and need to treat it as such. That's extra important for generated code.
It feels like a bell curve:
- one big set of users who don't like it because it generates a lot of code and uses its own style of algorithms, and it's a whole lot of unfamiliar code that the user has to load up in their mind - as you said. Too much to comprehend, and quickly overwhelming.
And then to either side
- it unblocks users who simply couldn't have written the code on their own, who aren't even trying to load it into their head. They are now able to make working programs!
- it accelerates users who could have written it on their own, given enough time, but have figured out how to treat it as an army of junior coders, and learned to only maintain the high level algorithm in their head. They are now able to build far larger projects, fast!
> Am I spending too much time on HN
Likely (as am I).
> LLMs are exciting but they produce messy code for which the dev feels no ownership. [...] The other side of it is people who seem to have 'gotten it' and can dispatch multiple agents to plan/execute/merge changes across a project
Yup, can confirm, there are indeed people with differing opinions and experience/anecdotes on HN.
> want to tell you how awesome their workflow is without actually showing any code.
You might be having some AI-news-fatigue (I can relate) and missed a few, but there are also people who seem to have gotten it and do want to show code:
Armin Ronacher (of Flask, Jinja2, Sentry fame): https://www.youtube.com/watch?v=nfOVgz_omlU (workflow) and https://lucumr.pocoo.org/2025/6/21/my-first-ai-library/ (code)
Here's one of my non-trivial open source projects where a large portion is AI built: https://github.com/senko/cijene-api (didn't keep stats, I'd eyeball it at conservatively 50% - 80%)
How is that different from working in a large codebase with 25+ other devs?
My org has 160 engineers working on our e-commerce frontend and middle tiers. I constantly dive into repos and code I have no ownership of. The git blame frequently shows a contractor who worked here 3 years ago.
Seems LLMs do well in small codebases, badly in medium ones, and well again as small modules within something big.
> for which the dev feels no ownership.
This is definitely something I feel is a choice. I've been experimenting quite a bit with AI generated code, and with any code that I intend to publish or maintain I've been very conscious in making the decision that I own the code and that if I'm not entirely happy with the AI generated output I have to fix it (or force the AI to fix it).
Which is a very different way of reviewing code than how you review another human's code, where you make compromises because you're equals.
I think this produces fine code: not particularly quickly, but used well, probably somewhat quicker (and somewhat higher quality code) than not using AI.
On the flip side, for some throwaway experiments and patches to personalize open source products that I have absolutely no intention of upstreaming, I've made the decision that the "AI" owns the code, and gone much more down the vibe coding route. This produces unmaintainable, sloppy code, but it works, and it takes a lot less work than doing it properly.
I suspect the companies that are trying to force people to use AI are going to get a lot more of the "no human ownership" code than individuals like me experimenting because they think it's interesting/fun.
I think they are making me more productive in achieving my targets and worse in my ability to program.
They are exactly like steroids - bigger muscles fast but tons of side effects and everything collapses the moment you stop. Companies don't care because they are more concerned about getting to their targets fast instead of your health.
Another harmful drug for our brain if consumed without moderation. I won't entirely stop using them but I have already started to actively control/focus my usage.
I think LLMs have made a lot of developers forget the lessons in "Simple Made Easy":
https://www.youtube.com/watch?v=SxdOUGdseq4
LLMs seem to be really good at reproducing the classic Ball of Mud, that can't really be refactored or understood.
There's a lot of power in creating simple components that interact with other simple components to produce complex functionality, while each component stays easy to understand, debug, and predict the performance of. The trick is to figure out how to decompose your complex problem into these simple components and their interactions.
I suppose once LLMs get really good at that skill is when we really won't need developers anymore.
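To make that concrete, here's a minimal, hypothetical Python sketch of that kind of decomposition (all names and numbers are invented for illustration): tiny single-purpose functions composed into one pipeline, each easy to test and reason about on its own.

```python
from dataclasses import dataclass


@dataclass
class Order:
    subtotal: float
    country: str


# Each step is a small, single-purpose component that is trivial to test in isolation.
def apply_tax(order: Order, rates: dict[str, float]) -> float:
    """Return the order total including the country-specific tax rate."""
    return order.subtotal * (1 + rates.get(order.country, 0.0))


def apply_discount(total: float, threshold: float = 100.0, rate: float = 0.1) -> float:
    """Knock a percentage off large totals; leave small ones alone."""
    return total * (1 - rate) if total >= threshold else total


def price(order: Order, rates: dict[str, float]) -> float:
    # Complex behaviour emerges from composing simple parts,
    # not from one tangled function that does everything.
    return apply_discount(apply_tax(order, rates))


if __name__ == "__main__":
    print(price(Order(subtotal=120.0, country="DE"), {"DE": 0.19}))  # roughly 128.52
```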
This is pretty much the conclusion I've come to as well. It's not good at being an autocomplete for entire chunks of your codebase. You lose the mental model of what is doing what, and exactly where. I prefer to use it as a personalized, faster-iterating StackOverflow. I'll ask it to give me a rundown of a concept I'm not familiar with, or for a general direction to point me in if I'm uncertain of what a good solution would be. Then I'll make the decision, and implement it myself. That workflow has worked out much better for me so far.
One point I haven't seen made elsewhere yet is that LLMs can occasionally make you less productive. If they hallucinate a promising-seeming answer and send you down a path that you wouldn't have gone down otherwise, they can really waste your time. I think on net, they are helpful, especially if you check their sources (which might not always back up what they are saying!). But it's good to keep in mind that sometimes doing it yourself is actually faster.
LLMs have limits. They are super powerful but they can't make the kind of leap humans can. For example, I asked both Claude and Gemini the problem below.
"I want to run webserver on Android but it does not allow binding on ports lower than 1000. What are my options?"
Both responded with the solutions below:
1. Use reverse proxy
2. Root the phone
3. Run on higher port
Even after asking them to rethink, they couldn't come up with the solution I was expecting. The solution to this problem is HTTPS RR records[1]. Both models knew about HTTPS RR but couldn't suggest it as a solution. It was only after I included it in their context that both agreed it was a possible solution.
[1]: https://rohanrd.xyz/posts/hosting-website-on-phone/
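For context, the trick in [1] relies (as I understand it) on the port parameter of the HTTPS/SVCB DNS record, which tells supporting clients to connect on a high port even though the user types a plain https:// URL. Here's a small sketch of how you might inspect such a record, assuming dnspython >= 2.1 and using a placeholder domain:

```python
# Sketch: look up a domain's HTTPS record to see the advertised port hint.
# Assumes dnspython >= 2.1 (pip install dnspython); "example.com" is a placeholder.
import dns.resolver


def show_https_record(domain: str) -> None:
    try:
        answers = dns.resolver.resolve(domain, "HTTPS")
    except dns.resolver.NoAnswer:
        print(f"{domain}: no HTTPS record published")
        return
    for rdata in answers:
        # Presentation form looks something like: "1 . alpn=h2 port=8443"
        print(f"{domain}: {rdata.to_text()}")


if __name__ == "__main__":
    show_https_record("example.com")
```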
I've dialed down a lot as well. The answers I got for my queries were too often plain wrong.
I instead started asking where I might look something up - in what man page, or in which documentation. Then I go read that.
This helps me build a better mental map about where information is found (e.g., in what man page), decreasing both my reliance on search engines, and LLMs in the long run.
LLMs have their uses, but they are just a tool, and an imprecise one at that.
My favorite use case for LLMs in long-term software production (they're pretty great at one-off stuff, since I don't need to maintain it) is as an advanced boilerplate generator.
Stuff that can't just be abstracted into a function or class but also requires no real thought. Tests are often (depending on what they're testing) in this realm.
I was resistant at first, but I love it. It's reduced the parts of my job that I dislike doing because of how monotonous they are and replaced them with a new fun thing to do - optimizing prompts that get it done for me much faster.
Writing the prompt and reviewing the code is _so_ much faster on tedious simple stuff, and it leaves the interesting, thought-provoking parts of my work for me to do.
I'm realizing that LLMs, for coding in particular but also for many other tasks, are a new version of the fad dieting phenomenon.
People really want a quick, low effort fix that appeals to the energy conserving lizard brain while still promising all the results.
In reality there aren't shortcuts, there's just tradeoffs, and we all realize it eventually.
I’m using ChatGPT (enterprise version paid by my employer) quite a lot lately, and I find it a useful tool. Here’s what I learned over time.
Don't feed many pages of code to the AI; it works best for isolated functions or small classes with few dependencies.
In 10% of cases when I ask to generate or complete code, the quality of the code is less than ideal but fixable with extra instructions. In 25% of cases, the quality of generated code is bad and remains so even after telling it what’s wrong and how to fix. When it happens, I simply ignore the AI output and do something else reasonable.
Apart from writing code, I find it useful at reviewing new code I wrote. Half of the comments are crap and should be ignored. Some others are questionable. However, I remember a few times when the AI identified actual bugs or other important issues in my code, and proposed fixes. Again, don’t copy-paste many pages at once, do it piecewise.
For some niche areas (examples are HLSL shaders, or C++ with SIMD intrinsics) the AI is pretty much useless, probably was not enough training data available.
Overall, I believe ChatGPT improved my code quality. Not only as a result of reviews, comments, or generated code, but also because my piecewise copy-pasting workflow improved the overall architecture by splitting the codebase into classes/functions/modules/interfaces, each doing its own thing.
I personally believe it's a mistake to invite AI into your editor/IDE. Keep it separate, in the browser; keep discrete, concise question-and-answer threads. Copy and paste whenever it delivers some gold (that comes with all the copy-pasta dangers, I know - oh, don't I know it!)
It's important to always maintain the developer role, don't ever surrender it.
The dichotomy between the people who are "orchestrating" agents to build software and the people experiencing these less-than-ideal outcomes from LLMs is fascinating.
I don't think LLM for coding productivity is all hype but I think for the people who "see the magic" there are many illusions here similar to those who fall prey to an MLM pitch.
You can see all the claims aren't necessarily unfounded, but the lack of guaranteed reproducibility leaves the door open for many caveats in favor of belief for the believer and cynicism for everybody else.
For the believers if it's not working for one person, it's a skill issue related to providing the best prompt, the right rules, the perfect context and so forth. At what point is this a roundabout way of doing it yourself anyway?
Over the last couple months I've gone from highly skeptical to a regular user (Copilot in my case). Two big things changed: First, I figured out that only some models are good enough to do the tasks I want (Claude Sonnet 3.7 and 4 out of everything I've tested). Second, it takes some infrastructure. I've added around 1000 words of additional instructions telling Copilot how to operate, and that's on top of tests (which you should have anyway) and 3rd party documentation. I haven't tried the fleet-of-agents thing, one VS Code instance is enough and I want to understand the changes in detail.
Edit: In concrete terms the workflow is to allow Copilot to make changes, see what's broken, fix those, review the diff against the goal, simplify the changes, etc, and repeat, until the overall task is done. All hands off.
Can relate. I've also shifted towards generating small snippets of code using LLMs, giving them a glance, and asking for unit tests for them. And then I review the unit tests carefully. But integrating the snippets together into the bigger system I always do myself. LLMs can do it sometimes, but when it becomes big enough that it can't fit into the context window, then it's a real issue, because now the LLM doesn't know what's going on and neither do you. So I'd advise you to use LLMs to generate tedious bits of code, but you must have the overall architecture committed to memory as well, so that when the AI messes up, at least you have some clue about how to fix it.
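As a hypothetical illustration of that workflow (the helper and tests below are invented, not from the comment above): a small snippet you'd let the LLM write and only glance at, plus the generated unit tests that you do review line by line.

```python
# Hypothetical LLM-generated helper: glanced at, not agonized over.
def chunk(items: list, size: int) -> list[list]:
    """Split items into consecutive chunks of at most `size` elements."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [items[i:i + size] for i in range(0, len(items), size)]


# The generated tests are where the careful human review happens:
# do they pin down the edge cases you actually care about?
import unittest


class ChunkTests(unittest.TestCase):
    def test_even_split(self):
        self.assertEqual(chunk([1, 2, 3, 4], 2), [[1, 2], [3, 4]])

    def test_ragged_tail(self):
        self.assertEqual(chunk([1, 2, 3], 2), [[1, 2], [3]])

    def test_empty_input(self):
        self.assertEqual(chunk([], 3), [])

    def test_invalid_size(self):
        with self.assertRaises(ValueError):
            chunk([1], 0)


if __name__ == "__main__":
    unittest.main()
```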
This parrots much of my own experience.
I don't have it write any of my Python firmware or Elixir backend stuff.
What I do let it rough in is web front end stuff. I view the need for and utility of LLMs in the html/css/tailwind/js space as an indictment of complexity and inconsistency. It’s amazing that the web front end stuff has just evolved over the years, organically morphing from one thing to another, but a sound well engineered simple-is-best set of software it is not. And in a world where my efforts will probably work in most browser contexts, no surprise that I’m willing to mix in a tool that will make results that will probably work. A mess is still a mess.
I wonder if this is as good as LLMs can get, or if this is a transition period between LLM as an assistant, and LLM as a compiler. Where in the latter world we don’t need to care about the code because we just care about the features. We let the LLM deal with the code and we deal with the context, treating code more like a binary. In that world, I’d bet code gets the same treatment as memory management today, where only a small percent of people need to manage it directly and most of us assume it happens correctly enough to not worry about it.
Personally, I follow the simple rule: "I type every single character myself. The AI/agent/etc offers inspiration." It's an effective balance between embracing what the tech can do (I'm dialing up my usage) and maintaining my personal connection to the code (I'm having fun + keeping things in my head).
I wrote about it: https://kamens.com/blog/code-with-ai-the-hard-way
My point of view: LLMs should be treated as a tool, not as a source of wisdom. I know someone who likes to answer people-related questions through an LLM. (E.g.: "What should this person do?" "What should we know about you?" etc.) More than once, this leads to him getting into a state of limbo when he tries to explain what he means by what he wrote. It feels a bit wild - a bit like back in school, when the guy who copied your homework is forced to explain how he ended up with the solution.
Two weeks ago I started heavily using Codex (I have 20+ years of dev experience).
At first I was very enthusiastic and thought Codex was helping me multiplex myself. But you actually spend so much time trying to explain the most obvious things to Codex, and it gets them wrong all the time in some nuanced way, that in the end you spend more time doing things via Codex than by hand.
So I also dialed back Codex usage and went back to doing many more things by hand, because it's just so much faster and much more predictable time-wise.
I like Zed's way of doing stuff (Ask mode). Just ask it a question and let it go through the whole thing. I still haven't figured out how to phrase the question so it doesn't just go off the rails and start implementing code. I don't care about the code; I ask it to either validate my mental model or improve it.
The way that LLMs are used/encouraged by business right now is evidence that they are mostly being pushed by people who don't understand software. I don't know very many actual software engineers who advocate for vibe-coding or using LLMs this way. I know a ton of engineers who advocate using them as helpful tools, myself included (in fact I changed my opinion on them as their capabilities grew, and I'll continue to do so).
Every tool is just a tool. No tool is a solution. Until and unless we hit AGI, only the human brain is that.
The use case for LLM assistance that provides value for me is solving obscure lint or static analysis warnings and errors.
I take the message, provide the surrounding code, and it gives me a few approaches to solve them. More than half the time, the resolution is there and I can copy the relevant bit in the literal verbiage. (The other times it's garbage but at least I can see that this is going to require some AI—Actual Intelligence.)
Really appreciated this take, hits close to home. I’ve found LLMs great for speed and scaffolding, but the more I rely on them, the more I notice my problem-solving instincts getting duller. There’s a tradeoff between convenience and understanding, and it’s easy to miss until something breaks. Still bullish on using AI for exploring ideas or clarifying intent, but I’m trying to be more intentional about when I lean in vs. when I slow down and think things through myself.
Sounds like he shot for the moon and missed.
I've been allowing LLMs to do more "background" work for me. Giving me some room to experiment with stuff so that I can come back in 10-15 minutes and see what it's done.
The key thing I've come to is that the task HAS to be fairly limited. Giving it a big task like refactoring a code base won't work. Giving it an example can help dramatically. If you haven't "trained" it by giving it context or adding your CLAUDE.md file, you'll end up finding it doing things you don't want it to do.
Another great task I've been giving it while I'm working on other things is generating docs for existing features and modules. It is surprisingly good at looking at events, following those events to see where they go, and generating diagrams and the like.
These seem like good checkpoints (and valid criticisms) on the road to progress.
But it's also not crazy to think that with LLMs getting smarter (and considerable resources put into making them better at coding), that future versions would clean up and refactor code written by past versions. Correct?
I've found the Cursor autocomplete to be nice, but I've learned to only accept a completion if it's byte for byte what I would've written. With the context of surrounding code it guesses that often enough to be worth the money for me.
The chatbot portion of the software is useless.
I've been starting my prompts more and more with the phrase "Let's brainstorm".
Really powerful seeing different options, especially based on your codebase.
> I wouldn't give them a big feature again. I'll do very small things like refactoring or a very small-scoped feature.
That really resonates with me. Anything larger often ends badly, and I can feel the "tech debt" building in my head with each minute Copilot is running. I do like the feeling, though, when you understand a problem already, write a detailed prompt to nudge the AI in the right direction, and it executes just like you wanted. After all, problem solving is why I'm here and writing code is just the vehicle for it.
I've gone the other way and put a lot of effort into figuring out how to best utilize these things. It's a rough learning curve and not trivial, especially given how effortless stuff looks and feels at first.
I think that people are just too quick to assume this is amazing, before it is there. Which doesn't mean it won't get there.
Somehow, if I take the best models and agents, most hard coding benchmarks are below 50%, and even SWE-bench Verified is at maybe 75-80%. Not 95. Assuming agents just solve most problems is incorrect, despite them being really good at first prototypes.
Also, in my experience agents are great up to a point and then fall off a cliff. Not gradually. The types of errors you get past that point are so diverse, one cannot even explain it.
LLMs give you a power tool after you spent your whole career using hand tools.
A chainsaw and chisel do different things and are made for different situations. It’s great to have chainsaws, no longer must we chop down a giant tree with a chisel.
On the other hand there’s plenty of room in the trade for handcraft. You still need to use that chisel to smooth off the fine edges of your chainsaw work, so your teammates don’t get splinters.
I wish there was a browser addon that worked like ublock but for LLM talk. Like just take it all, every blog post, every announcement, every discussion and wipe it all away. I just want humanity to deal with some of our actual issues, like fascism, war in Europe and the middle east, the centralization of our lines of production, the unfairness in our economies.
Instead we're stuck talking about if the lie machine can fucking code. God.
I find LLMs very efficient at a lot of things except writing big chunks of code that require some organization. They're good at closing knowledge gaps, finding strategies to solve problems, writing code for defined scopes and even reviewing large PRs. However, as others said, they can easily alienate you from your own project.
LLMs require fuzzy input and are thus good for fuzzy output, mostly things like recommendations and options. I just do not see a scenario where fuzzy input can lead to absolute, opinionated output unless extremely simple and mostly done before already. Programming, design, writing, etc. all require opinions and an absolute output from the author to be quality.
Using Gemini CLI (I really need to try out Claude Code one day), you ask it to make a change and it gives you a diff of what it plans to change.
You can say no, then give it more specific instructions like "keep it more simple" or "you don't need that library to be imported".
You can read the code and ensure you understand what it's doing.
Using AI definitely makes me more productive. However I spend 80% of my "AI time" fixing mistakes made by the AI or explaining why the solution doesn't work.
This is still an ad that tries to lure heretics in by agreeing with them. This is the new agile religion. Here are suit-optimized diagrams:
https://zed.dev/agentic-engineering
"Interwoven relationship between the predictable & unpredictable."
> Alberto initially embraced LLMs with genuine enthusiasm, hoping they would revolutionize his development workflow.
Is there any other way but down from such revolutionary ungrounded expectations?
Just like garbage sources on search engines or trash stack overflow answers. There’s still plenty of junk to sift through with LLM.
LLMs will even throw irrelevant data points into the output, which causes further churn.
I feel not much has changed.
Unfortunately it seems that nobody 'dialed back' LLM usage for the summary on that page - a good example of how un-human such text can feel to read.
Previous discussion on the linked blog post:
https://news.ycombinator.com/item?id=44003700
It’s almost like the LLMs are simply a glorified autocomplete and have no actual understanding of anything they’re doing. Huh weird!
LLMs are not a magic wand you can wave at anything and have your work done for you. What's new?
Credit to the Zed team here for publishing something that goes somewhat against their own book.
I spent today rewriting a cloud function I'd done with the "help" of an LLM.
Looked like dog shit, but worked fine till it hit some edge cases.
Had to break the whole thing down again and pretty much start from scratch.
Ultimately not a bad day's work, and I still had it on for autocomplete on doc-strings and such, but like fuck will I be letting an agent near code I do for money again in the near future.
LLMs save me a lot of time as a software engineer because they save me a ton of time doing either boilerplate work or mundane tasks that are relatively conceptually easy but annoying to actually have to do/type/whatever in an IDE.
But I still more-or-less have to think like a software engineer. That's not going to go away. I have to make sure the code remains clean and well-organized -- which, for example, LLMs can help with, but I have to make precision requests and (most importantly) know specifically what I mean by "clean and well-organized." And I always read through and review any generated code and often tweak the output because at the end of the day I am responsible for the code base and I need to verify quality and I need to be able to answer questions and do all of the usual soft-skill engineering stuff. Etc. Etc.
So do whatever fits your need. I think LLMs are a massive multiplier because I can focus on the actual engineering stuff and automate away a bunch of the boring shit.
But when I read stuff like:
"I lost all my trust in LLMs, so I wouldn't give them a big feature again. I'll do very small things like refactoring or a very small-scoped feature."
I feel like I'm hearing something like, "I decided to build a house! So I hired some house builders and told them to build me a house with three bedrooms and two bathrooms and they wound up building something that was not at all what I wanted! Why didn't they know I really liked high ceilings?"
When I first came across the idea of vibe coding, my first reaction was that this was taking things too far. Isn't it enough that your LLM can help you:
- autocomplete
- suggest possible solutions to a problem you've taken the time to understand
- spend less time reading documentation, guide your approach, and sometimes even identify obscure APIs that could help you get shit done
- review your code
- come up with multiple designs for a solution
- evaluate multiple designs you come up with for trade-offs
- understand your problem better, along with the available APIs
- write a prototype of some piece of code
I feel like LLMs are already doing quite a lot. I spend less time rummaging through documentation or trying to remember obscure api's or other pieces of code in a software project. All I need is a strong mental model about the project and how things are done.
There is a lot of obvious heavy lifting that LLMs are doing that I for one am not able to take for granted.
For people facing constraints similar to those in a resource-constrained economic environment, the benefits of any technology that helps them spend less time doing work that doesn't deliver value are immediately obvious.
It is no longer an argument about whether it is hype; it is more about how best to use it to achieve your goals. Forget the hype. Forget the marketing of AI companies - they have to do that to sell their products - nothing wrong with that. Don't let companies or bloggers set your own expectations of what could or should be done with this piece of tech. Just get on the bandwagon and experiment and find out what is too much. In the end I feel we will all come away from these experiments knowing that LLMs are already doing quite a lot.
TRIVIA: I even came across this article: https://www.greptile.com/blog/ai-code-reviews-conflict. It points out how LLM reliance can bring both the 10x dev and the 1x dev closer to a median of "goodness". So the 10x dev probably gets worse and the 1x dev ends up getting better - I'm probably that guy, because I tend to miss subtle things in code and Copilot review has had my ass for a while now - I haven't had defects like that in a while.
We need an app to rate posts on how clickbaity their titles are, and let you filter on this value.
Is this Zed Shaw’s blog?
Uh oh, I think the bubble is bursting.
Personally, the initial excitement has worn off for me and I am enjoying writing code myself, just using Kagi Assistant to ask the odd question, mostly research.
When a teammate who bangs on about how we should all be using AI tried to demo it and got things in a bit of a mess, I knew we had peaked.
And all that money invested into the hype!
"Why I'm dialing back my high level language usage"
Too many of us fall into “prompt autopilot” mode—reaching for AI before we think. Your post calls it out beautifully: step back, reclaim the muscle memory of creative problem solving. LLMs should supplement, not substitute. That discipline often separates thoughtful integration from dependency.
The problem with Zed's narrative is that because the author failed to use it in productive ways, he wants to dial it back altogether, but it's not clear what he has actually attempted. People dogpiling here reminds me of artists who are hostile to AI tools; it doesn't accurately reflect the true state of the marketplace, which actually puts a lot of value on successful LLM/AI tool use, especially in the context of software development.
If you extrapolate this blog then we shouldn't be having so much success with LLMs, we shouldn't be able to ship product with fewer people, and we should be hiring junior developers.
But the truth of the matter, especially for folks who work on agents focused on software development, is that we can see a huge tidal shift happening, similar to what artists, photographers, translators and copywriters have experienced.
The blog sells the idea that LLMs are not productive and need to be dialed down, but that does not tell the whole story. This does not mean I am saying LLMs should be used in all scenarios; there are clearly situations where they might not be desirable. But overall, the productivity hindrance narrative I repeatedly see on HN isn't convincing, and I suspect it is highly biased.