This has been a very long time coming and the crackup we're starting to see was predicted long before anyone knew what an LLM is.
The catalyst is the shift towards software transparency: both the radically increased adoption of open source and source-available software, and the radically improved capabilities of reversing and decompilation tools. It has been over a decade since any ordinary off-the-shelf closed-source software was meaningfully obscured from serious adversaries.
This has been playing out in slow motion ever since BinDiff: you can't patch software without disclosing vulnerabilities. We've been operating in a state of denial about this, because there was some domain expertise involved in becoming a practitioner for whom patches were transparently vulnerability disclosures. But AIs have vaporized the pretense.
It is now the case that any time something gets merged into mainline Linux, several different organizations are feeding the diffs through LLM prompts aggressively evaluating whether they fix a vulnerability and generating exploit guidance. That will be the case for most major open source projects (nginx, OpenSSL, Postgres, &c) sooner rather than later.
The norms of coordinated disclosure are not calibrated for this environment. They really haven't been for the last decade.
I'm weirdly comfortable with this, because I think coordinated disclosure norms have always been blinkered, based on the unquestioned premise that delaying disclosure for the operational convenience of system administrators is a good thing. There are reasons to question that premise! The delay also keeps information out of the hands of system operators who have options other than applying patches.
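For concreteness, a minimal sketch of the diff-triage pipeline described above (the model name, the prompt, and the OpenAI client are placeholder assumptions; any LLM API would look much the same):

    import subprocess
    from openai import OpenAI   # assumption: any chat-completion API works the same way

    client = OpenAI()

    def recent_commits(repo, n=50):
        out = subprocess.run(["git", "-C", repo, "rev-list", f"--max-count={n}", "HEAD"],
                             capture_output=True, text=True, check=True)
        return out.stdout.split()

    def triage(repo, sha):
        diff = subprocess.run(["git", "-C", repo, "show", "--patch", sha],
                              capture_output=True, text=True, check=True).stdout[:20000]
        resp = client.chat.completions.create(
            model="gpt-4o-mini",   # assumption: any cheap model for the first pass
            messages=[{"role": "user", "content":
                       "Does this commit fix a security vulnerability? "
                       "Answer YES or NO, then one sentence on how it might be exploited.\n\n" + diff}],
        )
        return resp.choices[0].message.content.strip()

    for sha in recent_commits("/path/to/linux"):
        print(sha[:12], triage("/path/to/linux", sha))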
> It has been over a decade since any ordinary off-the-shelf closed-source software was meaningfully obscured from serious adversaries.
Probably goes without saying, but the last line of defense is not deploying your software publicly and instead relying on client-server architectures to do anything. Maybe this will become more common as vulnerabilities get easier to detect and exploit. Of course it's not always feasible.
It has been annoying seeing my (proguard obfuscated) game client binaries decompiled and published on github many times over the last 11 years. Only the undeployed server code has remained private.
Interestingly I didn't have a problem with adversaries reverse engineering my network protocols until I was updating them less frequently than weekly. LLM assisted adversaries could probably keep up with that now too.
> How easy do you think it is for an LLM to build a decent emulator of the server in question, just by observing what you send and what you get back as a response?

Because while you could get something that drives a dumb interface, by moving the work and data to the server it's not available for the emulation software to use.
If the contract is well defined, the LLM can infer what its purpose is, the implementation, possibly even your secret sauce. There is no software moat anymore.
Yes, this is what I was trying to say. It's quite common on older client-server games to do this sort of thing. Powerful AI models will just make the work of recreating/emulating servers faster.
Except that emulating what is seen is surprisingly useful to find attack vectors. As a single deeper datapoint, one can look at more than just baseline behavior and delve into timing details to further refine implementation guesses.
> based on the unquestioned premise that delaying disclosure for the operational convenience of system administrators is a good thing. There are reasons to question that premise!
Care to mention these reasons?
With "convenience of system administrators", I'm guessing you mean that there's a patch available that sysadmins can install, ideally before the vulnerability is disclosed? What else are sysadmins supposed to do, in your opinion? Fix the vulnerability themselves? Or simply shutdown the servers?
With the various copyfails of recent, it at least was possible to block the affected modules. If that were not the case, what would you have done, as a sysadmin?
> BinDiff: you can't patch software without disclosing vulnerabilities
That’s why Microsoft has been obfuscating its binary builds for at least the last two decades so that even the two builds from the same source would produce very different blobs.
Sounds dubious, do you have a citation? The disassembly looks very straightforward for a lot of Windows code.

They're not encoded, but the code blocks are shuffled. That's why the disassembly does look straightforward, but it used to thwart BinDiff at the time.

If I understand correctly, that is just randomness that comes from parallel compiling and linking. If you're saying there is a whole build step just for scrambling blobs, I will be very surprised. What made you believe this is the case? Any examples/links?

It was a part of our Windows build process when I was at Microsoft. I only assumed that they would keep doing it, but they may well have dropped the practice since.
I believe this premise, that the cost of identifying vulnerabilities via diffs is going down over time, raises the question "what do our processes need to look like if simply making the patch public is the disclosure?"

Current coordinated disclosure practices depend on patching and disclosure being separate, but the gap between them seems to be asymptotically approaching zero.
Right, all I'm saying is that we were asymptotically close many years ago; all that's changed is that nobody can kid themselves about it anymore.
The actual policy responses to it, I couldn't say! I've always believed, even when there was a meaningful gap between patching and disclosing, that coordinated disclosure norms were a bad default.
I always understood the business reasons that brought about coordinated vulnerability disclosure & I've been forced to toe this line at employers, but I've always been firmly in the full disclosure camp. I am so ready for this.
This is exactly what happened with Log4Shell.

Day -(X+1): Engineer at Alibaba finds the vuln and tells Apache. Patch is pushed to git while the new release is coordinated.
Day -X: A black hat sees commits fixing the bug. Attacks start happening.
Day 0: Memes start circulating in Minecraft communities of people crashing servers. Some logs are shared on Twitter, especially in China, of people getting pwned.
Day 0 + ~4 hours: My friend DMs me a meme on Twitter. I look up to find the CVE. Doesn't exist. My friend and I reproduce the exploit and write up a blog post about it. (We name it Log4Shell to differentiate it from a different, older log4j RCE vuln)
Day ~1: Media starts picking it up. Apache is forced to release patches faster in response. CVE is actually published to properly allow security scanners to identify it.
Today: AI makes this happen faster and more consistently. Patches probably should be kept private until a coordinated disclosure happens post-testing and CVE being published?
Hard to say what the right move is, but this is gonna be happening a lot over the next 1-3 years. Lots of companies are going to be getting cooked until AI helps us patch faster than attackers can exploit these fresh 0-days.
I’m with you until that last sentence, which I’ve been thinking about as “… until AI code testing, vulnerability scanning, and developer support tools help to limit the number of 0-days and vulnerabilities making it into production”.
So prevention will be more important than ai-assisted rapid containment or patching, though both of those capabilities will be necessary as part of defense in depth.
And some sort of AI-enabled security analysis across the organization’s architecture that is done as part of testing ahead of new software entering production to ID potential vulnerabilities caused by configuration changes or upgrades that modify how systems interact with each other.
I’ve been trying to guess the timeframe for seeing improved secure development, but I’m hoping it’s a bit closer to 6 months - 1 year given the speed of AI adoption and AI progression. May be closer to 3 years as you stated.
In the meantime, is there more to be done than this (not in order)?
- Patch COTS software
- re-evaluate the scoring for previous vulnerabilities
- set up containment capabilities for systems that can't be patched / high-risk vendors
- use frontier model vuln scanning and patching for home grown systems that may have more 0-days than COTS depending on the organization’s capability
- limit the number of vendors / simplify the tech stack.
I’d be happy to hear how others are thinking about this.
We simply can't absolve ourselves of responsibility for the inputs and expect a hardened output. It's ABSOLUTELY up to the engineers to have test harnesses and scenarios for testing, vulnerability scanning, etc. Just because we can move faster via prompts doesn't mean we neglect the SDLC.
I think there's opportunity to reinvent the pipeline with AI powered tools to assist but the onus is still on the person to ensure they are deploying something that has been tested.
This feels more like an old problem getting reframed as an AI problem.
people were already diffing kernel commits and figuring out which ones were security fixes long before llms. if a patch lands publicly, the race has basically already started.
also not sure shorter embargoes really help. the orgs that can patch in hours are already fine. everyone else still takes days or weeks.
if anything, cheaper exploit generation probably makes coordinated disclosure more important, not less.
> people were already diffing kernel commits and figuring out which ones were security fixes
With skill, and usually not consistently and systematically. With AI, anyone can do this to any software.
> not sure shorter embargoes really help
Why 90 days versus 2 years? The author is arguing the factors that set that balance have shifted, given the frequency of simultaneous discovery. The embargo window isn’t an actual window, just an illusion, if the exploit is going to be found by several people outside the embargo anyway.
> cheaper exploit generation probably makes coordinated disclosure more important
I agree. But it also makes it less viable. If script kiddies can find and exploit zero days, the capacity to co-ordinate breaks down.
There was always a guild ethic that drove white-hat culture. If the guild is broken, the ethic has nothing to stand on.
> With skill, and usually not consistently and systematically.
How do you know? If the people who like to crow about vulnerabilities aren't doing it, it doesn't mean that the people who are actually in a position to exploit them systematically and effectively aren't doing it.
Those embargoes have always been dangerous, because they create a false sense of security. But, as you point out...
> With AI, anyone can do this to any software.
Yep. Even if it hadn't been true before, it's clear that now you just have to assume that everybody relevant will immediately recognize the security impact of any patch that gets published. That includes both bugs fixed and bugs introduced.
... and as the AI gets better, you're going to have to assume that you don't even have to publish a patch. Or source code. Within way less time than it's going to take people to admit it and adjust, any vulnerability in any software available for inspection is going to be instant public knowledge. Or at least public among anybody who matters.
>any vulnerability in any software available for inspection is going to be instant public knowledge. Or at least public among anybody who matters.
Shouldn't this naturally lead to a state where all (new) code is vulnerability-free? If AI vulnerability detection friction becomes low enough it'll become common/forced practice to pre-scan code.
Finding a vulnerability by looking at the diff that fixed it is very different from just looking through the code.

They're saying to run that scan on every diff before release, to see if it finds anything.

The point is that even if all code commits are scanned as safe by AI, black hats can still analyse the commits and diffs to find vulnerabilities to use against people who haven't patched yet. Scanning every commit doesn't automatically make everyone in the world patch immediately; vulns can still be found from commits and diffs and used against those who haven't patched yet.

I believe their point was that "how likely is this diff a patch for an existing vulnerability?" seems to be an easier question to answer than "are there any new vulnerabilities introduced by this diff?" In other words, identifying that a patch is for a vulnerability is typically easier than finding the vulnerability in the first place.

If the diff is just going to be fed to LLMs regardless, then what is easier is probably a moot point. The diff yields the patched code, which is used to produce the exploit.
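A rough sketch of what that "scan every diff before release" gate could look like in CI; scan_diff_for_vulns is a hypothetical stand-in for whatever AI scanner or SAST tooling you trust, and a "last-release" tag is assumed:

    import subprocess, sys

    def staged_diff():
        # Assumes a "last-release" tag; diff of everything about to ship against it.
        return subprocess.run(["git", "diff", "last-release..HEAD"],
                              capture_output=True, text=True, check=True).stdout

    def scan_diff_for_vulns(diff):
        # Hypothetical plug-in point: call your LLM scanner / SAST tooling here and
        # return a list of human-readable findings. Placeholder returns nothing.
        return []

    if __name__ == "__main__":
        findings = scan_diff_for_vulns(staged_diff())
        if findings:
            print("Blocking release; possible vulnerabilities introduced:")
            for f in findings:
                print(" -", f)
            sys.exit(1)   # fail the CI job so a human (or the agent) has to act first
        print("No findings; release can proceed.")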
> How do you know?

We know because we could see the effects of the average rate of vulnerabilities discovery and exploitation, and it's definitely going up very fast. Until recently, vulnerabilities were relatively hard to find, and finding them was done by a very restricted group of people world-wide, which made them quite valuable. Not any more.
That's correlation, not causation.

It could equally be argued that the AI slop that's being produced makes for a lot more vulnerabilities being shipped. The bigger target makes for the easier discovery.
> But don't we know that some of the vulnerabilities being discovered predate AI coding?

Certainly, and some discoveries have been attributed to AI (I was reading that Mozilla Firefox were praising Mythos recently).
But that's not accounting for all of the discoveries, not at all.
I've also seen the npm people talking about the surge in AI code overwhelming the ability to properly review what's being distributed, and a large number of vulnerabilities being attributed to that
It likely varies enormously between projects. Linux remains extremely low in slop, and the vulnerabilities being fixed are quite old, so it's improving. Many vibe-coded projects are very sloppy, and are adding a lot of vulnerabilities.
Total number of vulnerabilities likely goes up over time weighting all projects equally, but goes down over time weighting by usage.
> Is there evidence serious vulnerabilities are the result of vibe coding already? I haven't seen any, so if you have some references, please share.

Security researcher Dor Zvi and his team at the cybersecurity firm he cofounded, RedAccess, analyzed thousands of vibe-coded web applications created using the AI software development tools Lovable, Replit, Base44, and Netlify and found more than 5,000 of them that had virtually no security or authentication of any kind. Many of these web apps allowed anyone who merely finds their web URL to access the apps and their data. Others had only trivial barriers to that access, such as requiring that a visitor sign in with any email address. Around 40 percent of the apps exposed sensitive data, Zvi says, including medical information, financial data, corporate presentations, and strategy documents, as well as detailed logs of customer conversations with chatbots.
I mean - you're spot on - which is why I'd be more inclined to ask for actual metrics rather than feels/vibes, and I'd be very clear that the information I was basing my thinking on has enormous pitfalls.
This is the basis for "correlation points to possibly fertile grounds for an investigation"
Pragmatically, correlation *is* evidence of causation in favour of the best explanation, until somebody finds a better explanation.
> It could equally be argued that the AI slop that's being produced makes for a lot more vulnerabilities being shipped.
This is also true, and does not exclude the other, because for the moment the vast majority of production software in the world (and therefore the bulk of enticing targets) was written before AI. If LLM software will become prevalent in commercial setups, then LLM-generated code will eventually become the majority of targets.
> Pragmatically, correlation is evidence of causation in favour of the best explanation, until somebody finds a better explanation.
Uh, no.
Correlation is only ever one thing - cause for investigation.
Everything based on correlation alone is speculation.
You can speculate all you like, I have zero issue with that, but that's best prefaced with "I guess"
edit: Science captures this perfectly, and people misunderstand this so fundamentally that there is a massive debate where people who think they are "pro science" argue this so badly with theists that they completely hoist themselves with their own petard.
Science uses the term "theory" because all of our understanding is based on "available data" - and science's biggest contribution to humanity is that it accepts that the current/leading THEORY can and will be retracted if there is compelling data discovered that demonstrates a falsehood.
So - because I know this is coming - yes science is willing to accept some correlation - BUT it's labelled "theory" or "statistically significant" because science is clear that if other data arises then that idea will need to be revisited.
You have moved from "We know" to "We have an educated guess" which is the right way to couch things.
However I wanted to also point out that relying only on educated guesses can lead us into a position where we are "papering over the cracks" or "addressing the symptoms", not the "underlying cause"
Yes, sometimes that's all that can be done, but, also, sometimes it can be more damaging than the cause itself (thinking in terms of the cause continuing to fester away, whilst we think it's 'solved')
> You have moved from "We know" to "We have an educated guess"
No. You kept blabbering about "science" when most uses of knowledge are not about science. The original topic was also definitely not "science": it was about having a reasonable opinion about whether, empirically, the rate of discovery of vulnerabilities is increasing or not.
Trying to reframe this as 'not science' after being caught on a logical fallacy doesn't change the record. You started with a definitive claim ('We know') to shut down a question. When challenged on the lack of causation, you pivoted to 'educated guesses.'
My point remains: if we misattribute the cause of the rising vulnerability rate (discovery vs. creation), our 'educated guesses' will lead to solutions that address the symptoms while the underlying problem continues to fester. Calling precision 'blabbering' is exactly how we end up with the 'false sense of security' mentioned earlier.
Exhibit A (ragall):

> How do you know?

We know because we could see the effects of the average rate of vulnerabilities discovery and exploitation, and it's definitely going up very fast. Until recently, vulnerabilities were relatively hard to find, and finding them was done by a very restricted group of people world-wide, which made them quite valuable. Not any more.

Exhibit B (ragall):

Very often you only have limited time for investigation and you have to act now. Action is almost always based on educated guesses.
> people were already diffing kernel commits and figuring out which ones were security fixes

> With skill, and usually not consistently and systematically. With AI, anyone can do this to any software.

I would like to see actual evidence of this, not... vibes.
I mean, this reeks of "Anyone is a Principal developer now" when the truth is there is still work to do.
I haven't been keeping tabs for the entirety of Linux development, but has it ever happened before that someone dropped a working exploit from the mailing list before the patch even hit the kernel?
I haven't seen this kind of thing and I get the impression, despite all the hype, that this will be a frequent phenomenon now thanks to LLMs.
I don't think hot patching holds the same relevance it did in 2010.
Much of today's workloads are containerized and run on roughly ephemeral nodes that can be switched out easily; K8s version upgrades more or less force this. We tend to run more and more off-the-shelf hardware and worry less about individual node failures now.

In-memory updates are also not magic, and can be limited, since they require data structure semantics to not really change, and they can create their own class of issues/bugs, including security ones.

While I'm sure there are still use cases that dictate this type of update, the need is a lot less than 15 years ago, so I doubt the patent expiry will do much to the ecosystem.
The US is at war. Much of the world is at war at the cyber attack level right now.
The US, the EU, most of the Middle East, Israel, Russia...
Major services have been attacked and have gone down for days at a time - Ubuntu, Github, Let's Encrypt, Stryker. Entire hospital systems have had to partially shut down.
Now, in the middle of this, AI has made attacks much faster to generate. Faster than the defensive side can respond. Zero-day attacks used to be rare. Now they're normal.
It's going to get worse before it gets better. Maybe much worse.
If we assume that there will be an AI that is perfect in terms of ability to find vulnerabilities, cheap to run and widely available to everyone, then anyone can run it on any piece of software before deploying it. All vulnerabilities get found before they can be exploited.
One of the big challenges with cybersecurity is that attackers only need to find one exploit, while defenders need to stop everything. When you have a large surface area and limited resources, it's much easier to be the side that only has to succeed once. AI eliminates the limited resources problem.
Right now we are at a point in time when AI can find bugs for attackers and defenders, but defenders did not fix/find those bugs yet.
In time most of the bugs AI can find will be fixed, and things will calm down. Some bugs will be left, but they will be too complex to find and weaponise (or only rarely).

In short, attackers have the advantage for a brief time now, but ultimately defenders will win. I guess this "fight" might be over before the end of the year.
1) Make it a law that companies have to vet their code for security holes before release, 2) Make it a law that companies have to apply operational security best practice on their software products/services, 3) Industry standard automation for improvements to patch lifecycle management, 4) Auditing for critical businesses and industries to ensure safety (both as a national security thing and general safety/reliability/privacy/etc)
Right now all that stuff is optional, so most companies don't do it, which makes more security holes and it takes longer to patch.
We could get to a place where clouds provide a set of secure primitives that act as a framework.
E.g. you build an app, it stores data via api etc. etc. You can test in sandbox. The cloud deploys for customer who paid you via that cloud and you work at arms length. You may not even know their name. You just get the pro subscription fees.
The idea bubbling in my head would be an app store for cloud products. But with competition i.e. you use Railway or Heroku or AWS for the best deal.
Be gentle this is an idea in my head I am sure it can be torn down by a retort at this stage. But this exists in forms and I think it will emerge. It is inversion of control at the entire app level.
This is similar to buying a hammer. If you make hammers you sell them to a store, the store knows the customer and only the customer can see the nails.
No, it's similar to letting someone else do all your hammering because using a hammer is too dangerous. And then, to make the process more efficient, letting them take control of your home to be able to provide hammering services while making sure you can't touch the hammer.
You're assuming the fee would be small. Put yourself in the shoes of an insurance company, deciding what to charge for liability insurance. The potential cost if you have to pay out on the insurance is very very large: depending on the project, software vulnerabilities can cause millions to billions of damage to the economy. And the chance of you having to pay out is a complete unknown.
Unknown chance of having to pay out x large payout amount if you do = very very high premiums. Or not being willing to underwrite the insurance at all.
Remember, insurance is just gambling. The company is betting that the amount of money they'll make from everyone's total premiums added together is greater than the amount they'll have to pay out. Dumb gamblers don't last long. Smart gamblers will evaluate the risk and say "Okay, that'll be $X million a month in premiums", or even "Nope, we won't cover you". Can most open-source projects afford that?
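As a back-of-the-envelope illustration of that bet (all numbers invented):

    # Illustrative numbers only: a break-even premium is roughly the expected payout
    # plus a loading for costs, uncertainty, and profit.
    p_claim    = 0.02          # assumed annual probability of a payout
    avg_payout = 50_000_000    # assumed average payout when a claim happens ($)
    loading    = 0.30          # overhead / uncertainty / profit margin

    expected_loss = p_claim * avg_payout           # $1,000,000 per year
    premium = expected_loss * (1 + loading)        # ~$1,300,000 per year
    print(f"rough break-even annual premium: ${premium:,.0f}")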
I am looking at the results of a mass vulnerability scan as I type this. Half of the bugs in one case are in fact (binary) parser errors in hand-written parsers. These really should not exist in any language - but in C it's particularly bad. Kaitai Struct or something similar would broadly have prevented these. Rust would help here, but less than a parser generator (because a generator can automate error-checking insertion for things that aren't just out-of-bounds access).
However, half of the vulnerabilities are logic errors in terms of what I would call RBAC enforcement, incorrect access permissions, and so on. Rust won't help at all with any of these.
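To make the hand-written-parser point above concrete, here is a minimal sketch (record format invented for illustration) of the kind of bounds checking a parser generator such as Kaitai Struct emits for you, and hand-rolled C frequently forgets:

    import struct

    def parse_records(buf: bytes):
        """Parse [u16 length][payload] records; every offset is checked before use."""
        records, off = [], 0
        while off < len(buf):
            if off + 2 > len(buf):                     # the check hand-rolled C often forgets
                raise ValueError("truncated length field")
            (length,) = struct.unpack_from(">H", buf, off)
            off += 2
            if off + length > len(buf):                # ...and the other one
                raise ValueError("length runs past end of buffer")
            records.append(buf[off:off + length])
            off += length
        return records

    print(parse_records(b"\x00\x03abc\x00\x01z"))      # [b'abc', b'z']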
I was just working on a system best thought of as a “dinosaur”: written almost entirely in C (and a bit of Perl) and running on an appliance with BSD as the kernel.
It’s full of bugs and has had a string of RCE vulnerabilities published recently, probably because of Mythos.
Working with it day to day I get this feeling that the tech stack used results in a system that’s… clumsy and constrained.
Little things give me that impression, and I can’t quite put it in words, but it’s thirty years of experience working with dozens of languages and platforms speaking here.
Using C makes you clumsy.
It makes you trip over things other languages don’t.
It makes it obscenely difficult to do even simple things. It’s like trying to put a delicate ship into a bottle while wearing oven mitts.
Switching to a better language isn’t just about the specific capabilities of its compiler, it’s also about what it enables in the humans using it.
AI will shorten update windows dramatically. 2026 is the worst year to be thinking about dependency cooldowns, we need to think about dependency warmups instead.
Soon, there will be no such thing as a safe way to disclose a vulnerability in an open source project. Centralized SaaS will have a major security advantage here.
You could have a web of trust where Linux-using organizations each spend $x continuously scanning and patching their own dependencies with AI, and sending each other patches and scans.
I have unlimited access to every single frontier model, I've tested all of them, they are not good at writing software.
They are basically slot machines, sometimes you win a little bit and sometimes you win a lot but usually you just burn a ton of time and money sitting and staring at a screen (and frying your brain).
> "There are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies, and the other way is to make it so complicated that there are no obvious deficiencies." — Tony Hoare
Obviously the solution is for Linux to move to a closed-source development model.
Security researchers should report their findings to a committee that includes some big companies (IBM and Oracle seem like trustworthy choices here, but ideally we should find a way to get Microsoft included). Those companies would apply the security patches and distribute binary builds of Linux to their customers. Users fortunate enough to have a business relationship with those companies would be protected immediately. The source would still be published after 90 days for educational purposes and for anyone who doesn't appreciate the security benefits of this approach.
"But even if you could convince people to collaborate like this for the greater good, the GPL makes it legally impossible", you say. Ah, but the GPL only says you have to make the source available for a minimal monetary cost, it doesn't impose a time limit. Traditionally, responding to source code requests with a snail-mailed CD is good enough. No judge in the US is going to rule that a short administrative delay in sending out those CDs - in the name of everyone's security, after all, and 90 days is nothing to the judicial system - violates a nebulous licensing agreement from a different era.
I like how after so many years, people finally start recognizing that obscurity is a part of security. Not the whole security, obviously, but a part of it.
The quick test doesn't show a lot - by outright asking whether this is a security patch, it implies the answer and guides the AI toward output that agrees with that assumption. A confusion matrix is more useful. Nonetheless, of course this is not a detailed AI-capability-testing blog post.
I agree it is not much additional evidence! If someone wanted to try running the same test on a series of N commits from that list including this one I'd be very curious to see the answer!
Realistically, if you are scanning each kernel commit to check whether it might be patching a security issue, you are going to be asking an LLM "is this security related, and if so, vaguely how" with low effort, and taking "maybe" as a yes before feeding it to a more expensive model. You aren't trying to establish the probability of an ultimately unknowable fact; there is ground truth that you can find by producing an exploit, so you are just pre-filtering before spending the money to find it.
Yeah, ideally we would need the phi coefficient (aka MCC, the binary Pearson correlation), which can be calculated from a confusion matrix of yes/no LLM classifications for all kernel diffs. (Number of true positives, true negatives, false positives, false negatives.)
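For reference, the phi/MCC computed from those four counts (standard formula; the example numbers are invented):

    from math import sqrt

    def mcc(tp, tn, fp, fn):
        """Matthews correlation coefficient (phi) for a binary classifier."""
        denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
        return (tp * tn - fp * fn) / denom if denom else 0.0

    # e.g. a classifier that flags 40 of 50 real security fixes in 1000 commits
    # and wrongly flags 30 harmless ones:
    print(mcc(tp=40, tn=920, fp=30, fn=10))   # ~0.66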
One interesting thing is that this makes closed-source code an even greater asset for the defenders. Attackers cannot spend tokens on it, but defenders can spend tokens on hardening based on the source code, while the attacker is stuck with black-box testing.
You would be surprised how adept SOTA models are at reverse engineering with IDA/Ghidra or even plain old objdump. Opus basically knows IDAPython like the back of its hand.
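For a sense of what that looks like in practice, a minimal IDAPython sketch of the kind of enumeration a model typically drives (IDA 7+ API with the Hex-Rays decompiler, run inside IDA; the model call itself is omitted):

    # Run inside IDA: dump each function's pseudocode so it can be handed to a model.
    import idautils, idc, ida_hexrays

    for ea in idautils.Functions():
        name = idc.get_func_name(ea)
        try:
            pseudocode = str(ida_hexrays.decompile(ea))   # requires the Hex-Rays decompiler
        except ida_hexrays.DecompilationFailure:
            continue
        # In practice you would send `pseudocode` to the model and ask about memory-safety issues.
        print(f"=== {name} @ {ea:#x} ===")
        print(pseudocode[:2000])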
They can be, but the most interesting parts (backend code, deployment configs) are not usually available. Reversing clients can help you understand a bit, but not to the same level.
The "bugs are bugs" description reads pretty insane to me personally, but I know the Linux world has many people valuing the principle of it over practical matters.
90d seems long too though.
Think ultimately the big AI houses will need to help the core internet infra guys. Running latest and greatest AI over stuff like nginx and friends makes sense for us all collectively I think
We need automated patch and release cycles. So far we've relied on incredibly slow manual processes to accept reports, investigate, verify, patch, and prepare releases. Releasing a fix often takes months. This is way too slow when attackers can just churn out new exploits in hours. We need to iterate on value chain bottlenecks to lower Mean Time To Patch.
We should be able to turn around a bug report to a patched product ready for QA testing in 1 hour. Standardize/open source it, have the whole software supply chain use it (ex. Linux kernel -> distros -> products that use distros -> users). With AI there's no reason we can't do this, we're just slow.
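A skeleton of that hand-off, under the assumption that an agent has already produced a candidate patch from the bug report (git and "make test" stand in for the project's real tooling):

    import subprocess

    def apply_candidate_patch(repo, patch_text):
        # patch_text is assumed to come from an LLM/agent that read the bug report.
        subprocess.run(["git", "-C", repo, "apply", "-"],
                       input=patch_text, text=True, check=True)

    def tests_pass(repo):
        # "make test" stands in for whatever the project's real test entry point is.
        return subprocess.run(["make", "-C", repo, "test"]).returncode == 0

    def prepare_for_qa(repo, patch_text):
        """Bug report -> candidate patch -> automated tests -> human review and QA sign-off."""
        apply_candidate_patch(repo, patch_text)
        if not tests_pass(repo):
            subprocess.run(["git", "-C", repo, "checkout", "--", "."], check=True)
            return False   # bounce back to the patch-generation step
        subprocess.run(["git", "-C", repo, "switch", "-c", "security-fix-candidate"], check=True)
        return True        # a developer reviews and QA tests from this branch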
On the other hand, automated fast rollouts lead to a CrowdStrike-type situation where you brick all the computers of the world immediately.
Imo we are going to have to rely on more layers of security. Systems that are designed to be secure even in the presence of individual vulnerabilities. This has already been happening for a while on mobile platforms and game consoles. Even physical hardware designed to keep particular secrets /keys even from the kernel.
The crowdstrike situation wasn't due to fast rollouts, it was due to a total lack of testing. You can do fast rollouts, with testing, and a mandatory QA signoff. It's called 'continuous delivery' rather than 'continuous deployment'.
I actually don't think more layers of security will fix this. It would be nice if our systems were more secure... but people are, if nothing else, lazy af. Even when adding security isn't a lot of work, people resist it if it "sounds complicated". So I think we're stuck with the status quo. But the big issue now isn't novel bug types, it's the speed in which they're found. Therefore we need to speed up our response.
Lack of care, slip ups, and bugs are basically a constant that will always exist. But we can architect systems which are secure even in the face of bugs. Multiple layers of security mean that even the most critical kernel bug in iOS can never extract your faceid data or encryption key because the hardware physically isn’t capable of it. OSs like Qubes utilise multiple VMs so any kernel bugs have limited reach.
When you look at consoles, they have built software that is resistant to outright glitching the CPU.
Sounds like you're expecting the AI-based tools that are finding bugs to also provide fixes.
I've been dealing with a bunch of AI-generated (or at least -assisted) vulnerability reports lately. In many cases the reports include proposed patches to fix the issues.
It's been..... interesting. In many cases, the analysis provided in the report has been accurate and helpful. In some cases, the proposed patches have also been good, and we've accepted them with minimal or no changes.
In other cases, despite finding a valid issue, and even providing a good analysis of the problem, the AI tool's suggested patch has been, quite simply, wrong.
Careful review from somebody who really _understands_ the code -- and the wider context in which it is operating -- is still absolutely necessary. That's not always going to happen in an hour.
Yes, that's why I specified "patched product ready for QA testing". It speeds up the development cycle by making a first pass and ensuring it basically works before passing it to a developer for manual review and a QA tester to ensure the fix doesn't break anything else. Both dev and QA are still in the feedback loop and can make changes until it's ready for release
A 3rd culture - the "security through obscurity" culture, where some random little library might be a potential weak link, but will anyone really bother to hack it?

Not as worrisome in a philosophical way (since it's not a serious culture), but it's a real issue. And just wait for a nation state to start astroturfing helpful little libraries at scale ...
It sounds to me like the safe assumption with software is that no matter how solid your stack is, there are vulnerabilities, potentially catastrophic. A question to folks more experienced than me - if my business depends on software, and I know that my software is almost certainly exploitable, how do I posture my business in such a way as to minimize the impacts of exploits like these?
When Windows was the predominant desktop OS in the 90s and maybe early 00s (ok, maybe still is), it was so badly insecure that you could be pretty much sure that it would be easy to compromise.
That's when firewalls were widely deployed to provide some layer of protection.
So you can ask yourself, what is the (possibly metaphorical) firewall in the software you depend on?
Is there any way you can decrease attack surface, separate out the most important data in extra-secure (and thus less accessible) systems?
I must admit I'm rather enjoying this particular form of shit show, mostly because it was a prediction I made in 2023 in the early days of LLMs. It wasn't really a problem related to LLMs but a glaring hole in the thinking of current computing, which is the "frustratingly over-connected" and "over-trust" approach to everything. After reading Liu Cixin's "The Three-Body Problem" and noting the Dark Forest, I applied that to risk vectors and came to the conclusion that our over-connected nature plus some form of acceleration plus some form of negative impact would fuck us big time.
Turns out it did.
Thus we should probably start treating our thinking model of computing as a Dark Forest, not a friendly community. That mitigates these risks to some degree.
If you're into gaming, Cyberpunk 2077 is essentially set in a heavily technologized world, where all compute infrastructure is infested with rogue AI that replicates itself onto any technology it can physically get in touch with. The only recourse is a new web, built from first principles, protected by (probably) benevolent AI systems. Every device, every server, is partially occupied by AIs doing their thing on it, virtually networked into a digital universe.
I found that a fascinating thought.
I'd argue it's actually breaking three vulnerability cultures. In addition to the two Jeff mentions, I think the culture of delaying upgrades and staying on stable versions for as long as possible is going to become increasingly untenable, if everything that's not latest can be trivially scanned and exploited. In the extreme I think there's a decent chance projects like Debian might have to radically overhaul or just shut down completely - the whole philosophy of slow and steady with old code just won't work.
There will be much wailing and gnashing of teeth around this, because a lot of tech types really resent having to update constantly, but I don't think people will have a choice. If you have a complicated stack where major or even minor version updates are a huge hassle, I'd start working now to try and clear out the cruft and grease those wheels.
> there's a decent chance projects like Debian might have to radically overhaul or just shut down completely - the whole philosophy of slow and steady with old code just won't work.
Debian continuously issues security updates for stable versions, ingestable with automatic updates. “Stable” doesn’t mean that vulnerabilities aren’t getting fixed.
The argument that could be made is that keeping up with getting vulnerabilities fixed might become such a high workload that fewer releases can be maintained in parallel, and therefore the lifetime and/or overlap of maintained releases would have to be reduced. But the argument for abandoning stable releases altogether doesn’t seem cogent.
It goes both ways: Stable code that only receives security updates becomes less vulnerable over time, as the likelihood of new vulnerabilities being introduced is comparatively low. From that point of view, stable software actually has a leg up over continuous (“eternal beta” in the worst case) functional updates.
I can only dream, but this may re-popularize (among the rest of the non-Debian software industry) the general best practice of keeping a "sustaining" branch green, buildable, and with frequent releases, for security fixes.
I hate software that forces you to take new features as a condition of obtaining bug and security fixes. We need to keep old "stable" builds around for longer and maintain them better. I know, I know, it is really upsetting to developers to have to backport things to old versions--they wish that all they had to work on was the current branch. But that just causes guys like me to never upgrade because the downside of upgrading (new features) is worse than the upside (security fixes).
> In the extreme I think there's a decent chance projects like Debian might have to radically overhaul or just shut down completely - the whole philosophy of slow and steady with old code just won't work.
It may actually be the opposite.
Debian's steady and professional approach to shipping security patches with very little to no functional difference actually enables us to consider and work on automated, autonomous weekly or faster patching of the entire fleet. And once that's in place and trusted, emergency rollouts are very possible and easy.
We have other projects that "move fast and break things" and ship whatever they want in whatever versions they want and those will require constant attention to ship any update for a security topic. These projects require constant human attention to work through their shenanigans to keep them up to date.
Not only that, but Debian has, for example, debsecan, so you can see on any system which CVEs exist and whether your packages are patched (e.g. I ran it on my system and it printed the CVEs affecting my installed packages).
That's not really the culture of debian to be honest. Yes they run old major and minor versions, but they do ship patch updates as fast as they can. Even on debian stable, you absolutely are supposed to update all the time. The culture of "just don't touch it" is a different one (but also exists, I've seen it).
Oh yeah, to be clear: Debian has always been good about quickly shipping patches to kernel vulnerabilities, and they will continue to be so. I was more thinking about whether they will get overwhelmed if every bit of software they package just has a firehose of vulnerabilities on everything which isn't latest.
We are now paying for the sins of our fathers (well and mostly ourselves).
We've just kept building more complex things with more exposure, with no recognition that the day of reckoning was coming. And now we are in an untenable situation. With governments spending billions on AI with the big providers, it's likely they've found many of these already.
> On the other side you have "bugs are bugs" culture. This is especially common in Linux, where the argument is that if the kernel is doing something it shouldn't then someone somewhere may be able to turn it into an attack. Just fix things as quickly as possible, without drawing attention to them. Often people won't notice, with so many changes going past, and there's still time to get machines patched.
The 3rd one is what I practice when giving companies time to fix their issue. Note, I haven't reported anything to FOSS projects, but to several companies I found exploits in. I give them 5 days. If they don't respond at all in the first day, I deduct 1 day - apparently they're either incompetent or don't care. After the 5 days have passed, I make it public. So far they've all fixed the issue on the 3rd or 4th day.
If I were to report something to a FOSS project, I'd give them a bit more, say 8-9 days. Enough time for everyone to wake up, review the vuln, patch it and ship it. Enough time for all the downstream projects to also ship the patch.
90 days is ridiculous, especially for companies. If I report something on Friday 23:30 and they reply Monday 15:00 - what were they doing during the weekend? Did they forget their software is used 24/7? I had one company complain quite a bit, threatening to sue. When they realized there was no one to sue (me being anonymous with my report), they fixed it in less than a day.
Bottom line - if you're a company offering a product or service, you should have a security team 24/7.
If you're a FOSS project - either alert your users to stop using your software or disable the service yourself, if you can.
If it's an extremely important life-or-death service you can't shut off - then fix it quickly. What are you doing with life-or-death stuff when you can't react quickly enough?
Fuck the 90 days standard - it's what companies want us to do because it's easier on them. If security hasn't been your top priority, you have a few days to make it your top priority.
With AI, that makes even more sense now. Bugs won't be able to stay hidden for months. Especially bugs I've reported like IDORs or SQL injections - things everyone tries first.
(and I love Linux, but getting an "Oh noes!" from Anubis at kernel.org because I don't have cookies enabled (I do??) really makes me not want to report anything to the Linux kernel in particular. If I ever did find something, I'd just immediately post it as a HN comment or something like that)
It depends on the kind of vulnerability, but sometimes in order to fix a problem, you need to do an enormous amount of software engineering. Which needs to be done to a very high standard, because the expectation is that people will push security patches more or less immediately to production.
Of course, this only works if no one else is likely to discover the vulnerability in the meantime!
The company can almost always shut down their service until they fix it. They'll lose money and their customers could also lose money if they depend on the service. That's the price they'll have to pay. Otherwise, they should either work frantically 24/7 to fix the vuln or if they can't, they should accept the fact that they've pushed code without any regard for security and bear the consequences.
Why do we need to put up with excuses? If a company has lots of complicated code that would need enormous amount of time to fix, it's on them. They decided to release this code into the wild.
If I publish the vuln publicly, the users would have the option to stop using the software/service until it's patched. If a customer is using a service without caring about security, it's on them. I want to protect the customers who would monitor the news for such vulns and protect themselves.
How would you apply this logic to something like https://meltdownattack.com ? The vulnerability was in hardware, discovered by companies that make user level software, and mitigated by changes to OS kernels.
Maybe it is about time for Linux to get real CI/CD and start using AI extensively.

Not just for vulnerabilities; having nice agents.md / skills.md / etc. definitions would encourage new devs to contribute instead of dealing with an overworked maintainer repeating the same thing for the nth time.
This has been a very long time coming and the crackup we're starting to see was predicted long before anyone knew what an LLM is.
The catalyst is the shift towards software transparency: both the radically increased adoption of open source and source-available software, and the radically improved capabilities of reversing and decompilation tools. It has been over a decade since any ordinary off-the-shelf closed-source software was meaningfully obscured from serious adversaries.
This has been playing out in slow motion ever since BinDiff: you can't patch software without disclosing vulnerabilities. We've been operating in a state of denial about this, because there was some domain expertise involved in becoming a practitioner for whom patches were transparently vulnerability disclosures. But AIs have vaporized the pretense.
It is now the case that any time something gets merged into mainline Linux, several different organizations are feeding the diffs through LLM prompts aggressively evaluating whether they fix a vulnerability and generating exploit guidance. That will be the case for most major open source projects (nginx, OpenSSL, Postgres, &c) sooner rather than later.
The norms of coordinated disclosure are not calibrated for this environment. They really haven't been for the last decade.
I'm weirdly comfortable with this, because I think coordinated disclosure norms have always been blinkered, based on the unquestioned premise that delaying disclosure for the operational convenience of system administrators is a good thing. There are reasons to question that premise! The delay also keeps information out of the hands of system operators who have options other than applying patches.
> It has been over a decade since any ordinary off-the-shelf closed-source software was meaningfully obscured from serious adversaries.
Probably goes without saying but the last line of defense is not deploying your software publicly and instead relying on server-client architectures to do anything. Maybe this will be more common as vulnerabilities are more easily detected and exploited. Of course its not always feasible.
It has been annoying seeing my (proguard obfuscated) game client binaries decompiled and published on github many times over the last 11 years. Only the undeployed server code has remained private.
Interestingly I didn't have a problem with adversaries reverse engineering my network protocols until I was updating them less frequently than weekly. LLM assisted adversaries could probably keep up with that now too.
>Only the undeployed server code has remained private.
How easy to do you this is for LLM to build decent emulator of the server in question by just observing what you send and what you get as response?
not sure why downvoted. server emulators will become faster to make. protocol analysis will become faster as well.
Because while you could get something that drives a dumb interface, by moving the work and data to the server it's not available for the emulation software to use.
If the contract is well defined, the LLM can infer what it's purpose is, implementation, possibly even your secret sauce. There is no software moat anymore.
yes this is what i was trying to say. its quite common on older client-server games to do this sort of thing. powerful ai models will just make the work to recreate/emulate servers faster.
Except that emulating what is seen is surprisingly useful to find attack vectors. As a single deeper datapoint, one can look at more than just baseline behavior and delve into timing details to further refine implementation guesses.
> based on the unquestioned premise that delaying disclosure for the operational convenience of system administrators is a good thing. There are reasons to question that premise!
Care to mention these reasons?
With "convenience of system administrators", I'm guessing you mean that there's a patch available that sysadmins can install, ideally before the vulnerability is disclosed? What else are sysadmins supposed to do, in your opinion? Fix the vulnerability themselves? Or simply shutdown the servers?
With the various copyfails of recent, it at least was possible to block the affected modules. If that were not the case, what would you have done, as a sysadmin?
Many vulnerabities seem to be in code paths for rarely used features. They can often be disabled.
> BinDiff: you can't patch software without disclosing vulnerabilities
That’s why Microsoft has been obfuscating its binary builds for at least the last two decades so that even the two builds from the same source would produce very different blobs.
Sounds dubious, do you have a citation? The disassembly looks very straightforward for a lot of Windows code.
They're not encoded, but the code blocks are shuffled. That's why disassembly does look straightforward, but it used to thwart BinDiff at the time.
If I understand correctly, that is just randomness comes from parallel compiling and linking.
If you saying there is a whole step just scrambling blobs, i will be very surprised.
What made you believe this is the case? any examples/links/etc.?
It was a part of our Windows build process when I was at Microsoft. I only assumed that they would keep doing it, but they might have as well dropped the practice.
How are they obfuscated?
See my sibling comment.
I believe this premise that the cost of identification of vulnerabilities via diffs is going down over time begs the question "what do our processes need to look like if simply making the patch public is the disclosure?"
Current coordinated disclosure practices have a dependency on patching and disclosure being separate, but the gap between them seems to be asymptomatically approaching zero.
Right, all I'm saying is that we were asymptotically close many years ago; all that's changed is that nobody can kid themselves about it anymore.
The actual policy responses to it, I couldn't say! I've always believed, even when there was a meaningful gap between patching and disclosing, that coordinated disclosure norms were a bad default.
What process or mechanism would you prefer to use instead of coordinated disclosure?
I always understood the business reasons that brought about coordinated vulnerability disclosure & I've been forced to toe this line at employers, but I've always been firmly in the full disclosure camp. I am so ready for this.
You’re obviously one of the most knowledgeable people on this topic around here.
What would the best solution be? And where do you believe the industry is headed (which may very well be something other than the best solution) ?
I can’t think about anything other than improving operations, but given the state of the industry, this seems like a pipe dream.
This is exactly what happened with Log4Shell.
Day -X + 1: Engineer at Alibaba finds the vuln and tells Apache. Patch is pushed to git while new release is coordinated.
Day -X: A black hat sees commits fixing the bug. Attacks start happening.
Day 0: Memes start circulating in Minecraft communities of people crashing servers. Some logs are shared on Twitter, especially in China, of people getting pwned.
Day 0 + ~4 hours: My friend DMs me a meme on Twitter. I look up to find the CVE. Doesn't exist. My friend and I reproduce the exploit and write up a blog post about it. (We name it Log4Shell to differentiate it from a different, older log4j RCE vuln)
Day ~1: Media starts picking it up. Apache is forced to release patches faster in response. CVE is actually published to properly allow security scanners to identify it.
Today: AI makes this happen faster and more consistently. Patches probably should be kept private until a coordinated disclosure happens post-testing and CVE being published?
Hard to say what the right move is, but this is gonna be happening a lot over the next 1-3 years. Lots of companies are going to be getting cooked until AI helps us patch faster than attackers can exploit these fresh 0-days.
I’m with you until that last sentence, which I’ve been thinking about as “… until AI code testing, vulnerability scanning, and developer support tools help to limit the number of 0-days and vulnerabilities making it into production”.
So prevention will be more important than ai-assisted rapid containment or patching, though both of those capabilities will be necessary as part of defense in depth.
And some sort of AI-enabled security analysis across the organization’s architecture that is done as part of testing ahead of new software entering production to ID potential vulnerabilities caused by configuration changes or upgrades that modify how systems interact with each other.
I’ve been trying to guess the timeframe for seeing improved secure development, but I’m hoping it’s a bit closer to 6 months - 1 year given the speed of AI adoption and AI progression. May be closer to 3 years as you stated.
In the meantime, is there more to be done than this (not in order)?
- Patch COTS software
- re-evaluate the scoring for previous vulnerabilities
- set up up containment measures capabilities for systems that can’t be patched / high risk vendors
- use frontier model vuln scanning and patching for home grown systems that may have more 0-days than COTS depending on the organization’s capability
- limit the number of vendors / simplifying the tech stack.
I’d be happy to hear how others are thinking about this.
we simply can't absolve ourselves of responsibility in input and expect a hardened output. It's ABSOLUTELY up to the engineers to have test harnesses and scenarios for testing, vulnerability scanning, etc. Just because we can move faster via prompts doesn't mean we neglect the SDLC.
I think there's opportunity to reinvent the pipeline with AI powered tools to assist but the onus is still on the person to ensure they are deploying something that has been tested.
This feels more like an old problem getting reframed as an AI problem.
people were already diffing kernel commits and figuring out which ones were security fixes long before llms. if a patch lands publicly, the race has basically already started.
also not sure shorter embargoes really help. the orgs that can patch in hours are already fine. everyone else still takes days or weeks.
if anything, cheaper exploit generation probably makes coordinated disclosure more important, not less.
> people were already diffing kernel commits and figuring out which ones were security fixes
With skill, and usually not consistently and systematically. With AI, anyone can do this to any software.
> not sure shorter embargoes really help
Why 90 days versus 2 years? The author is arguing the factors that set that balance have shifted, given the frequency of simultaneous discovery. The embargo window isn’t an actual window, just an illusion, if the exploit is going to be found by several people outside the embargo anyway.
> cheaper exploit generation probably makes coordinated disclosure more important
I agree. But it also makes it less viable. If script kiddies can find and exploit zero days, the capacity to co-ordinate breaks down.
There was always a guild ethic that drove white-hate (EDIT: hat) culture. If the guild is broken, the ethic has nothing to stand on.
> With skill, and usually not consistently and systematically.
How do you know? If the people who like to crow about vulnerabilities aren't doing it, it doesn't mean that the people who are actually in a position to exploit them systematically and effectively aren't doing it.
Those embargoes have always been dangerous, because they create a false sense of security. But, as you point out...
> With AI, anyone can do this to any software.
Yep. Even if it hadn't been true before, it's clear that now you just have to assume that everybody relevant will immediately recognize the security impact of any patch that gets published. That includes both bugs fixed and bugs introduced.
... and as the AI gets better, you're going to have to assume that you don't even have to publish a patch. Or source code. Within way less time than it's going to take people to admit it and adjust, any vulnerability in any software available for inspection is going to be instant public knowledge. Or at least public among anybody who matters.
>any vulnerability in any software available for inspection is going to be instant public knowledge. Or at least public among anybody who matters.
Shouldn't this naturally lead to a state where all (new) code is vulnerability-free? If AI vulnerability detection friction becomes low enough it'll become common/forced practice to pre-scan code.
Finding a vulnerability by looking at the diff that fixed it is very different than just looking through the code.
They're saying to do that scan to every diff before release, to see if it finds anything.
The point is that even if all code commits are scanned as safe by AI, black hats can still analyse the commits and diffs to find vulnerabilities in the systems of people who haven't patched yet.
Scanning every commit doesn't automatically make everyone in the world patch immediately; vulns can still be found from commits and diffs and used against those who haven't patched yet.
I believe their point was that:
"How likely is this diff a patch for an existing vulnerability?"
Seems to be an easier question to answer than
"Are there any new vulnerabilities introduced by this diff?"
In other words identifying that a patch is for a vulnerability is typically easier than finding the vulnerability in the first place.
If the diff will just be fed to LLMs regardless then what is easier is probably a moot point.
The diff yields the patched code which is used to produce the exploit.
> it'll become common/forced practice to pre-scan code.
You'd think.
But then you'd think people would do a lot of other things too. I hope, I guess.
The other danger is that "the cloud" may become even more overwhelmingly dominant. Which of course has its own large security costs.
Remember (to you both): extrapolation is a perilous business.
Obligatory xkcd https://xkcd.com/605/
> How do you know?
We know because we can see the average rate of vulnerability discovery and exploitation, and it's definitely going up very fast. Until recently, vulnerabilities were relatively hard to find, and finding them was done by a very restricted group of people world-wide, which made them quite valuable. Not any more.
That's correlation, not causation.
It could equally be argued that the AI slop that's being produced makes for a lot more vulnerabilities being shipped. The bigger the target, the easier the discovery.
But don't we know that some of the vulnerabilities being discovered predate AI coding?
Certainly, and some discoveries have been attributed to AI (I was reading that the Mozilla Firefox folks were praising Mythos recently).
But that's not accounting for all of the discoveries, not at all.
I've also seen the npm people talking about the surge in AI code overwhelming the ability to properly review what's being distributed, with a large number of vulnerabilities being attributed to that.
It likely varies enormously between projects. Linux remains extremely low in slop, and the vulnerabilities being fixed are quite old, so it's improving. Many vibe-coded projects are very sloppy, and are adding a lot of vulnerabilities.
Total number of vulnerabilities likely goes up over time weighting all projects equally, but goes down over time weighting by usage.
Is there evidence serious vulnerabilities are the result of vibe coding already? I haven’t seen any so if you have some references, please share.
Security researcher Dor Zvi and his team at the cybersecurity firm he cofounded, RedAccess, analyzed thousands of vibe-coded web applications created using the AI software development tools Lovable, Replit, Base44, and Netlify and found more than 5,000 of them that had virtually no security or authentication of any kind. Many of these web apps allowed anyone who merely finds their web URL to access the apps and their data. Others had only trivial barriers to that access, such as requiring that a visitor sign in with any email address. Around 40 percent of the apps exposed sensitive data, Zvi says, including medical information, financial data, corporate presentations, and strategy documents, as well as detailed logs of customer conversations with chatbots.
https://www.wired.com/story/thousands-of-vibe-coded-apps-exp...
I mean - you're spot on - which is why I'd be more inclined to ask for actual metrics rather than feels/vibes, and I'd be very clear that the information I was basing my thinking on has enormous pitfalls.
This is the basis for "correlation points to possibly fertile grounds for an investigation"
> That's correlation, not causation.
Pragmatically, correlation *is* evidence of causation in favour of the best explanation, until somebody finds a better explanation.
> It could equally be argued that the AI slop that's being produced makes for a lot more vulnerabilities being shipped.
This is also true, and does not exclude the other, because for the moment the vast majority of production software in the world (and therefore the bulk of enticing targets) was written before AI. If LLM-generated software becomes prevalent in commercial setups, then LLM-generated code will eventually become the majority of targets.
> Pragmatically, correlation is evidence of causation in favour of the best explanation, until somebody finds a better explanation.
Uh, no.
Correlation is only ever one thing - cause for investigation.
Everything based on correlation alone is speculation.
You can speculate all you like, I have zero issue with that, but that's best prefaced with "I guess"
edit: Science captures this perfectly, and people misunderstand this so fundamentally that there is a massive debate where people who think they are "pro science" argue this so badly with theists that they completely hoist themselves with their own petard.
Science uses the term "theory" because all of our understanding is based on "available data" - and science's biggest contribution to humanity is that it accepts that the current/leading THEORY can and will be retracted if compelling data is discovered that demonstrates a falsehood.
So - because I know this is coming - yes science is willing to accept some correlation - BUT it's labelled "theory" or "statistically significant" because science is clear that if other data arises then that idea will need to be revisited.
Very often you only have limited time for investigation and you have to act now. Action is almost always based on educated guesses.
You have moved from "We know" to "We have an educated guess" which is the right way to couch things.
However I wanted to also point out that relying only on educated guesses can lead us into a position where we are "papering over the cracks" or "addressing the symptoms", not the "underlying cause"
Yes, sometimes that's all that can be done, but, also, sometimes it can be more damaging than the cause itself (thinking in terms of the cause continuing to fester away, whilst we think it's 'solved')
> You have moved from "We know" to "We have an educated guess"
No. You kept blabbering about "science" when most uses of knowledge are not about science. The original topic was also definitely not "science": it was about having a reasonable opinion about whether, empirically, the rate of discovery of vulnerabilities is increasing or not.
Trying to reframe this as 'not science' after being caught on a logical fallacy doesn't change the record. You started with a definitive claim ('We know') to shut down a question. When challenged on the lack of causation, you pivoted to 'educated guesses.'
My point remains: if we misattribute the cause of the rising vulnerability rate (discovery vs. creation), our 'educated guesses' will lead to solutions that address the symptoms while the underlying problem continues to fester. Calling precision 'blabbering' is exactly how we end up with the 'false sense of security' mentioned earlier.
Exhibit A:
ragall, 2 hours ago:
> How do you know?
We know because we can see the average rate of vulnerability discovery and exploitation, and it's definitely going up very fast. Until recently, vulnerabilities were relatively hard to find, and finding them was done by a very restricted group of people world-wide, which made them quite valuable. Not any more.
Exhibit B:
ragall, 2 hours ago:
Very often you only have limited time for investigation and you have to act now. Action is almost always based on educated guesses.
> people were already diffing kernel commits and figuring out which ones were security fixes
> With skill, and usually not consistently and systematically. With AI, anyone can do this to any software.
I would like to see actual evidence of this, not.. vibes
I mean, this reeks of "Anyone is a Principal developer now" when the truth is there is still work to do.
“White-Hat”
I'm here for white-hate culture. You should, you should know better.
I haven't been keeping tabs for the entirety of Linux development, but has it ever happened before that someone dropped a working exploit, derived from the mailing list, before the patch even hit the kernel?
I haven't seen this kind of thing and I get the impression, despite all the hype, that this will be a frequent phenomenon now thanks to LLMs.
> Torvalds said that disclosing the bug itself was enough, without the pursuant circus that followed when a major problem has been discovered. [1]
So it's not surprising Dirtyfrag was disclosed by a fix in the Linux kernel. [2]
[1] https://www.zdnet.com/article/torvalds-criticises-the-securi...
[2] https://afflicted.sh/blog/posts/copy-fail-2.html
I'd say it's an old problem being exacerbated by AI.
I find i’m writing variations of the same comment every week so I’m just going to share a previous version I wrote if you’ll permit the laziness:
https://news.ycombinator.com/item?id=47921829
Reminder: the Ksplice patent expires October 1, 2028.
I don't think hot patching holds the same relevance it did in 2010.
Much of today's workloads are containerized and run on roughly ephemeral nodes that can be switched out easily; K8s version upgrades more or less force this. We tend to run more and more off-the-shelf hardware and worry less about individual node failures now.
In-memory updates are also not magic, and can be limited, since they require data-structure semantics to remain essentially unchanged, and they can create their own class of issues/bugs, including security ones.
While I'm sure there are still use cases which dictate this type of update, the need is a lot less than 15 years ago, so I doubt the patent expiry will do much for the ecosystem.
What are the implications of that?
It means you wouldn't have to reboot to apply security updates to the Linux kernel. Assuming someone does something with that.
We have a huge problem.
The US is at war. Much of the world is at war at the cyber attack level right now. The US, the EU, most of the Middle East, Israel, Russia... Major services have been attacked and have gone down for days at a time - Ubuntu, Github, Let's Encrypt, Stryker. Entire hospital systems have had to partially shut down.
Now, in the middle of this, AI has made attacks much faster to generate. Faster than the defensive side can respond. Zero-day attacks used to be rare. Now they're normal.
It's going to get worse before it gets better. Maybe much worse.
> before it gets better
How is it going to get better?
If we assume that there will be an AI that is perfect in terms of ability to find vulnerabilities, cheap to run and widely available to everyone, then anyone can run it on any piece of software before deploying it. All vulnerabilities get found before they can be exploited.
One of the big challenges with cybersecurity is that attackers only need to find one exploit, while defenders need to stop everything. When you have a large surface area and limited resources, it's much easier to be the side that only has to succeed once. AI eliminates the limited resources problem.
> If we assume that there will be an AI that is perfect in terms of ability to find vulnerabilities
...so if we assume a halting oracle?
I'd speculate that at this point Linux etc are probably having vulnerabilities discovered and patched faster than created.
It's not only Linux though and many projects don't have the funding to perpetually use something like Mythos.
Right now we are at a point in time when AI can find bugs for attackers and defenders, but defenders haven't found and fixed those bugs yet.
In time most of the bugs AI can find will be fixed, and things will calm down. Some bugs will be left, but they will be too complex to find and weaponise (or only rarely).
In short, attackers have the advantage for a brief time now, but ultimately defenders will win. I guess this "fight" might be over before the end of the year.
1) Make it a law that companies have to vet their code for security holes before release, 2) Make it a law that companies have to apply operational security best practice on their software products/services, 3) Industry standard automation for improvements to patch lifecycle management, 4) Auditing for critical businesses and industries to ensure safety (both as a national security thing and general safety/reliability/privacy/etc)
Right now all that stuff is optional, so most companies don't do it, which makes more security holes and it takes longer to patch.
Basically make software development so legally risky that only multi-billion dollar corporations will ever engage in it.
We could get to a place where clouds provide a set of secure primitives that act as a framework.
E.g. you build an app, it stores data via an API, etc. You can test in a sandbox. The cloud deploys it for the customer who paid you via that cloud, and you work at arm's length. You may not even know their name. You just get the pro subscription fees.
The idea bubbling in my head would be an app store for cloud products. But with competition i.e. you use Railway or Heroku or AWS for the best deal.
Be gentle, this is an idea in my head; I am sure it can be torn down by a retort at this stage. But this exists in forms already, and I think it will emerge. It is inversion of control at the entire-app level.
This is similar to buying a hammer. If you make hammers you sell them to a store, the store knows the customer and only the customer can see the nails.
> This is similar to buying a hammer.
No, it's similar to letting someone else do all your hammering because using a hammer is too dangerous. And then, to make the process more efficient, letting them take control of your home to be able to provide hammering services while making sure you can't touch the hammer.
I guess. It is like a writer letting someone else print the books, maybe?
Legal risk is what insurance is for. You get insured for a small fee and you go about your job. That's how the non-software world operates anyway.
You're assuming the fee would be small. Put yourself in the shoes of an insurance company, deciding what to charge for liability insurance. The potential cost if you have to pay out on the insurance is very, very large: depending on the project, software vulnerabilities can cause millions to billions of dollars of damage to the economy. And the chance of you having to pay out is a complete unknown.
Unknown chance of having to pay out x large payout amount if you do = very very high premiums. Or not being willing to underwrite the insurance at all.
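With invented numbers, just to show the shape of the pricing:

    expected loss ≈ P(payout per year) × payout size
                  ≈ 0.02 × $50M          (both figures made up)
                  = $1M per year, before overhead, profit, or any loading
                    for how uncertain that 2% guess really is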
Remember, insurance is just gambling. The company is betting that the amount of money they'll make from everyone's total premiums added together is greater than the amount they'll have to pay out. Dumb gamblers don't last long. Smart gamblers will evaluate the risk and say "Okay, that'll be $X million a month in premiums", or even "Nope, we won't cover you". Can most open-source projects afford that?
Downplaying security now has real consequences for everyone.
Bulk rewrites of everything into Rust with AI assistance?
I am looking at the results of a mass vulnerability scan as I type this. Half of the bugs in one case are in fact (binary) parser errors in hand-written parsers. These really should not exist in any language, but in C it's particularly bad. Kaitai Struct or something similar would broadly have prevented these. Rust would help here, but less than a parser generator (because a generator can automate the insertion of error checking for things that aren't just out-of-bounds access).
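A minimal sketch of the class of bug, in Python rather than C and with a made-up record format, just to show where the missing check lives (in C this is where the out-of-bounds read happens; a parser generator emits the check for you):

    import struct

    def parse_record_handwritten(buf: bytes) -> bytes:
        # 2-byte big-endian length, then payload. The length field is
        # trusted as-is -- the typical hand-written-parser bug pattern.
        (length,) = struct.unpack_from(">H", buf, 0)
        return buf[2:2 + length]

    def parse_record_generated(buf: bytes) -> bytes:
        # The validation a generated parser would insert automatically.
        (length,) = struct.unpack_from(">H", buf, 0)
        if 2 + length > len(buf):
            raise ValueError("declared length exceeds buffer")
        return buf[2:2 + length]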
However, half of the vulnerabilities are logic errors in terms of what I would call RBAC enforcement, incorrect access permissions, and so on. Rust won't help at all with any of these.
I was just working on a system best thought of as a “dinosaur”: written almost entirely in C (and a bit of Perl) and running on an appliance with BSD as the kernel.
It’s full of bugs and has had a string of RCE vulnerabilities published recently, probably because of Mythos.
Working with it day to day I get this feeling that the tech stack used results in a system that’s… clumsy and constrained.
Little things give me that impression, and I can’t quite put it in words, but it’s thirty years of experience working with dozens of languages and platforms speaking here.
Using C makes you clumsy.
It makes you trip over things other languages don’t.
It makes it obscenely difficult to do even simple things. It’s like trying to put a delicate ship into a bottle while wearing oven mitts.
Switching to a better language isn’t just about the specific capabilities of its compiler, it’s also about what it enables in the humans using it.
I don't disagree with that, but my point is that Rust will not really solve vulnerabilities.
Rust is overly complex and difficult; Go is simpler and easier, and has the memory protection people are obsessed with.
> So many security fixes are coming out now that examining commits is much more attractive: the signal-to-noise ratio is higher
Why?
> Additionally, having AI evaluate each commit as it passes is increasingly cheap and effective
This is the key. With AI, the “people won't notice, with so many changes going past” assumption fails.
AI will shorten update windows dramatically. 2026 is the worst year to be thinking about dependency cooldowns; we need to think about dependency warmups instead.
Soon, there will be no such thing as a safe way to disclose a vulnerability in an open source project. Centralized SaaS will have a major security advantage here.
Closed source centralized SaaS will have a major security advantage.
Edit: Because an RCE in an open-source dependency means you are just as vulnerable when the security patch lands? I don’t see the controversy.
You could have a web of trust where Linux-using organizations each spend $x continuously scanning and patching their own dependencies with AI, and sending each other patches and scans.
LLMs aren't capable of doing this, and never will be, no matter what Anthropic tries to tell you.
That's the same mindset some people had 3 years ago when they said AI wouldn't be capable of software development. Look where we are now.
I have unlimited access to every single frontier model, I've tested all of them, they are not good at writing software.
They are basically slot machines, sometimes you win a little bit and sometimes you win a lot but usually you just burn a ton of time and money sitting and staring at a screen (and frying your brain).
Mozilla seems to think it can.
https://blog.mozilla.org/en/privacy-security/ai-security-zer...
Ahh yes, I'm sure agents did this all autonomously without any human in the loop whatsoever. They are useless without experts to handle them.
So have the Linux-using organizations employ experts to handle them, then.
The old saying of Tony Hoare about no obvious bugs vs obviously no bugs holds in the age of LLMs more than ever
> "There are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies, and the other way is to make it so complicated that there are no obvious deficiencies." — Tony Hoare
for those (like me) who hadn't seen it before.
Obviously the solution is for Linux to move to a closed-source development model.
Security researchers should report their findings to a committee that includes some big companies (IBM and Oracle seem like trustworthy choices here, but ideally we should find a way to get Microsoft included). Those companies would apply the security patches and distribute binary builds of Linux to their customers. Users fortunate enough to have a business relationship with those companies would be protected immediately. The source would still be published after 90 days for educational purposes and for anyone who doesn't appreciate the security benefits of this approach.
"But even if you could convince people to collaborate like this for the greater good, the GPL makes it legally impossible", you say. Ah, but the GPL only says you have to make the source available for a minimal monetary cost, it doesn't impose a time limit. Traditionally, responding to source code requests with a snail-mailed CD is good enough. No judge in the US is going to rule that a short administrative delay in sending out those CDs - in the name of everyone's security, after all, and 90 days is nothing to the judicial system - violates a nebulous licensing agreement from a different era.
I like how after so many years, people finally start recognizing that obscurity is a part of security. Not the whole security, obviously, but a part of it.
Just like there's LLM-automated vulnerability fuzzing, there's LLM-automated decompilation. Compilation is no longer a meaningful way to obscure code.
The comment you replied to read like satire to me.
There are already closed-source operating systems you can use instead of Linux. No need to enshittify Linux.
The quick test doesn't show a lot: by outright asking whether this is a security patch, you prime the AI to produce output that agrees with that assumption. A confusion matrix would be more useful. Nonetheless, of course, this is not a detailed AI-capability-testing blog post.
[author]
I agree it is not much additional evidence! If someone wanted to try running the same test on a series of N commits from that list including this one I'd be very curious to see the answer!
Realistically, if you are scanning each kernel commit to check whether it might be patching a security issue, you are going to be asking an LLM "is this security related, and if so, vaguely how?" with low effort, and taking "maybe" as a yes before feeding it to a more expensive model. You aren't trying to establish a probability of an ultimately unknowable fact; there is ground truth that you can find by producing an exploit, so you are just trying to pre-filter before spending the money to find it.
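A rough sketch of what that pre-filter might look like; `ask_llm` is a stand-in for whatever cheap model you call (not a real API), and the prompt wording is invented:

    import subprocess
    from typing import Callable

    PROMPT = ("Does this diff look like it fixes a security-relevant bug? "
              "Answer YES, MAYBE or NO, then one sentence of reasoning.\n\n")

    def prefilter(rev_range: str, ask_llm: Callable[[str], str]) -> list[str]:
        """Return commit hashes worth escalating to a more expensive model."""
        shas = subprocess.run(["git", "rev-list", rev_range],
                              capture_output=True, text=True, check=True).stdout.split()
        flagged = []
        for sha in shas:
            diff = subprocess.run(["git", "show", sha],
                                  capture_output=True, text=True, check=True).stdout
            # Anything that isn't a clear NO goes through: we only care
            # about cheap pre-filtering here, not the final answer.
            if not ask_llm(PROMPT + diff).strip().upper().startswith("NO"):
                flagged.append(sha)
        return flagged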
Yeah, ideally we would need the phi coefficient (aka MCC, the binary Pearson correlation), which can be calculated from a confusion matrix of yes/no LLM classifications for all kernel diffs. (Number of true positives, true negatives, false positives, false negatives.)
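For reference, a minimal sketch of that calculation, with invented counts just to show the shape:

    from math import sqrt

    def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
        # phi coefficient / Matthews correlation from the confusion-matrix counts
        denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
        return (tp * tn - fp * fn) / denom if denom else 0.0

    # e.g. 40 real security fixes flagged, 900 ordinary commits passed,
    # 50 ordinary commits flagged, 10 security fixes missed (made-up numbers):
    print(mcc(tp=40, tn=900, fp=50, fn=10))  # ~0.57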
> Luckily AI can speed up defenders as well as attackers here, allowing embargoes that would previously have been uselessly short.
This is an important facet of the problem space: security risks turning into an arms race for who wants to spend more tokens.
One interesting thing is that this makes closed-source code an even greater asset for the defenders. The attacker cannot spend tokens on it, but defenders can spend tokens on hardening based on the source code, while the attacker is stuck with black-box testing.
You would be surprised how adept SOTA models are at reverse engineering with IDA/Ghidra or even plain old objdump. Opus basically knows IDAPython like the back of its hand.
They can be, but the most interesting parts (backend code, deployment configs) are not usually available. Reversing clients can help you understand a bit, but not to the same level.
On the other hand, any source code leak could be catastrophic
Decompilation is quite good these days as well
Reverse engineering vulnerabilities from patches is red team 101...
The "bugs are bugs" description reads pretty insane to me personally, but I know the Linux world has many people valuing the principle of it over practical matters.
90 days seems long too, though.
I think ultimately the big AI houses will need to help the core internet infra guys. Running the latest and greatest AI over stuff like nginx and friends makes sense for us all collectively, I think.
We need automated patch and release cycles. So far we've relied on incredibly slow manual processes to accept reports, investigate, verify, patch, and prepare releases. Releasing a fix often takes months. This is way too slow when attackers can just churn out new exploits in hours. We need to iterate on value chain bottlenecks to lower Mean Time To Patch.
We should be able to turn around a bug report to a patched product ready for QA testing in 1 hour. Standardize/open source it, have the whole software supply chain use it (ex. Linux kernel -> distros -> products that use distros -> users). With AI there's no reason we can't do this, we're just slow.
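A very rough sketch of that loop; every helper here is a placeholder for tooling you'd have to build or buy, not an existing API:

    from typing import Callable

    def report_to_qa_candidate(report: str,
                               propose_patch: Callable[[str], str],    # AI-assisted first pass
                               run_test_suite: Callable[[str], bool],  # existing CI harness
                               open_review: Callable[[str], None]) -> bool:
        """Turn a bug report into a candidate patch queued for human review and QA."""
        patch = propose_patch(report)
        if not run_test_suite(patch):
            return False       # bounce back for manual triage
        open_review(patch)     # dev review and QA sign-off still happen before release
        return True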
On the other hand, automated fast rollouts lead to a CrowdStrike-type situation where you brick all the computers of the world immediately.
Imo we are going to have to rely on more layers of security. Systems that are designed to be secure even in the presence of individual vulnerabilities. This has already been happening for a while on mobile platforms and game consoles. Even physical hardware designed to keep particular secrets /keys even from the kernel.
The CrowdStrike situation wasn't due to fast rollouts; it was due to a total lack of testing. You can do fast rollouts, with testing, and a mandatory QA signoff. It's called 'continuous delivery' rather than 'continuous deployment'.
I actually don't think more layers of security will fix this. It would be nice if our systems were more secure... but people are, if nothing else, lazy af. Even when adding security isn't a lot of work, people resist it if it "sounds complicated". So I think we're stuck with the status quo. But the big issue now isn't novel bug types, it's the speed in which they're found. Therefore we need to speed up our response.
Lack of care, slip-ups, and bugs are basically a constant that will always exist. But we can architect systems which are secure even in the face of bugs. Multiple layers of security mean that even the most critical kernel bug in iOS can never extract your Face ID data or encryption key, because the hardware physically isn’t capable of it. OSes like Qubes utilise multiple VMs so any kernel bugs have limited reach.
When you look at consoles, they have built software that is resistant to outright glitching the CPU.
Sounds like you're expecting the AI-based tools that are finding bugs to also provide fixes.
I've been dealing with a bunch of AI-generated (or at least -assisted) vulnerability reports lately. In many cases the reports include proposed patches to fix the issues.
It's been..... interesting. In many cases, the analysis provided in the report has been accurate and helpful. In some cases, the proposed patches have also been good, and we've accepted them with minimal or no changes.
In other cases, despite finding a valid issue, and even providing a good analysis of the problem, the AI tool's suggested patch has been, quite simply, wrong.
Careful review from somebody who really _understands_ the code -- and the wider context in which it is operating -- is still absolutely necessary. That's not always going to happen in an hour.
Yes, that's why I specified "patched product ready for QA testing". It speeds up the development cycle by making a first pass and ensuring it basically works before passing it to a developer for manual review and a QA tester to ensure the fix doesn't break anything else. Both dev and QA are still in the feedback loop and can make changes until it's ready for release
what could go wrong? :DDD
imagine patching everything up automatically and it's a malware
everything cooked
A 3rd culture - the "security through obscurity" culture, where some random little library might be a potential weak link, but will anyone really bother to hack it?
Not as worrisome in a philosophical way (since it's not a serious culture), but it's a real issue. And just wait for a nation state to start astroturfing helpful little libraries at scale ...
It sounds to me like the safe assumption with software is that no matter how solid your stack is, there are vulnerabilities, potentially catastrophic. A question to folks more experienced than me - if my business depends on software, and I know that my software is almost certainly exploitable, how do I posture my business in such a way as to minimize the impacts of exploits like these?
When Windows was the predominant desktop OS in the 90s and maybe early 00s (ok, maybe still is), it was so badly insecure that you could be pretty much sure that it would be easy to compromise.
That's when firewalls were widely deployed to provide some layer of protection.
So you can ask yourself, what is the (possibly metaphorical) firewall in the software you depend on?
Is there any way you can decrease attack surface, separate out the most important data in extra-secure (and thus less accessible) systems?
I must admit I'm rather enjoying this particular form of shit show, mostly because it was a prediction I made in 2023 in the early days of LLMs. It wasn't really a problem related to LLMs but a glaring hole in the thinking of current computing, which is the "frustratingly over-connected" and "over-trust" approach to everything. After reading Liu Cixin's "The Three-Body Problem" and noting the Dark Forest, I applied that to risk vectors and came to the conclusion that our over-connected nature plus some form of acceleration plus some form of negative impact would fuck us big time.
Turns out it did.
Thus we should probably start treating our thinking model of computing as a Dark Forest, not a friendly community. That mitigates these risks to some degree.
If you're into gaming, Cyberpunk 2077 is essentially set in a heavily technologized world, where all compute infrastructure is infested with rogue AI that replicates itself onto any technology it can physically reach. The only recourse is a new web, built from first principles, protected by (probably) benevolent AI systems. Every device, every server, is partially occupied by AIs doing their thing on it, virtually networked into a digital universe. I found that a fascinating thought.
I'd argue it's actually breaking three vulnerability cultures. In addition to the two Jeff mentions, I think the culture of delaying upgrades and staying on stable versions for as long as possible is going to become increasingly untenable, if everything that's not latest can be trivially scanned and exploited. In the extreme I think there's a decent chance projects like Debian might have to radically overhaul or just shut down completely - the whole philosophy of slow and steady with old code just won't work.
There will be much wailing and gnashing of teeth around this, because a lot of tech types really resent having to update constantly, but I don't think people will have a choice. If you have a complicated stack where major or even minor version updates are a huge hassle, I'd start working now to try and clear out the cruft and grease those wheels.
> there's a decent chance projects like Debian might have to radically overhaul or just shut down completely - the whole philosophy of slow and steady with old code just won't work.
Debian continuously issues security updates for stable versions, ingestable with automatic updates. “Stable” doesn’t mean that vulnerabilities aren’t getting fixed.
The argument that could be made is that keeping up with getting vulnerabilities fixed might become such a high workload that fewer releases can be maintained in parallel, and therefore the lifetime and/or overlap of maintained releases would have to be reduced. But the argument for abandoning stable releases altogether doesn’t seem cogent.
It goes both ways: Stable code that only receives security updates becomes less vulnerable over time, as the likelihood of new vulnerabilities being introduced is comparatively low. From that point of view, stable software actually has a leg up over continuous (“eternal beta” in the worst case) functional updates.
I can only dream, but this may re-popularize (among the rest of the non-Debian software industry) the general best practice of keeping a "sustaining" branch green, buildable, and with frequent releases, for security fixes.
I hate software that forces you to take new features as a condition of obtaining bug and security fixes. We need to keep old "stable" builds around for longer and maintain them better. I know, I know, it is really upsetting to developers to have to backport things to old versions--they wish that all they had to work on was the current branch. But that just causes guys like me to never upgrade because the downside of upgrading (new features) is worse than the upside (security fixes).
> In the extreme I think there's a decent chance projects like Debian might have to radically overhaul or just shut down completely - the whole philosophy of slow and steady with old code just won't work.
It may actually be the opposite.
Debian's steady and professional approach to shipping security patches with very little to no functional difference actually enables us to consider and work on automated, autonomous weekly (or faster) patching of the entire fleet. And once that's in place and trusted, emergency rollouts are very possible and easy.
We have other projects that "move fast and break things" and ship whatever they want in whatever versions they want and those will require constant attention to ship any update for a security topic. These projects require constant human attention to work through their shenanigans to keep them up to date.
Not only that, but Debian has, for example, debsecan, so you can see on any system which CVEs exist and whether your packages are patched. E.g. from my system I ran it and got
> CVE-2026-32105 xrdp
which I see has a fix in sid but not in bookworm
That's not really the culture of Debian, to be honest. Yes, they run old major and minor versions, but they do ship patch updates as fast as they can. Even on Debian stable, you absolutely are supposed to update all the time. The culture of "just don't touch it" is a different one (but it also exists; I've seen it).
Debian has updated kernel packages out for the stable release. https://security-tracker.debian.org/tracker/CVE-2026-43284
I kind of get your point, but they responded pretty quickly here.
Oh yeah, to be clear: Debian has always been good about quickly shipping patches to kernel vulnerabilities, and they will continue to be so. I was more thinking about whether they will get overwhelmed if every bit of software they package just has a firehose of vulnerabilities on everything which isn't latest.
We are now paying for the sins of our fathers (well and mostly ourselves).
We've just kept building more complex things with more exposure with no recognition that the day of reckoning was coming. And now we are in an untenable situation. With governments spending billions on AI with the big providers it's likely they've found many of these already.
Yep. This is why I am using local AI to edit and build my own copies of Linux kernel, Wayland... everything a distribution would ship really.
Not so daunting for me having come of age when compiling a kernel specific to a hardware platform was essential.
Custom software that does not fit the usual patterns is not foolproof, but it won't be an obvious target.
Monocultures with all their eggs in one basket are even less secure than truly diverse ecosystems though.
Arch Linux to become the only Linux OS left.
> On the other side you have "bugs are bugs" culture. This is especially common in Linux, where the argument is that if the kernel is doing something it shouldn't then someone somewhere may be able to turn it into an attack. Just fix things as quickly as possible, without drawing attention to them. Often people won't notice, with so many changes going past, and there's still time to get machines patched.
The 3rd one is what I practice when giving companies time to fix their issue. Note, I haven't reported anything to FOSS projects, but to several companies I found exploits in. I give them 5 days. If they don't respond at all in the first day, I deduct 1 day - apparently they're either incompetent or don't care. After the 5 days have passed, I make it public. So far they've all fixed the issue on the 3rd or 4th day.
If I were to report something to a FOSS project, I'd give them a bit more, say 8-9 days. Enough time for everyone to wake up, review the vuln, patch it and ship it. Enough time for all the downstream projects to also ship the patch.
90 days is ridiculous, especially for companies. If I report something on Friday 23:30 and they reply Monday 15:00 - what were they doing during the weekend? Did they forget their software is used 24/7? I had one company complain quite a bit, threatening to sue. When they realized there was no one to sue (me being anonymous with my report), they fixed it in less than a day.
Bottom line - if you're a company offering a product or service, you should have a security team 24/7.
If you're a FOSS project - either alert your users to stop using your software or disable the service yourself, if you can.
If it's an extremely important life-or-death service you can't shut off - then fix it quickly. What are you doing with life-or-death stuff when you can't react quickly enough?
Fuck the 90 days standard - it's what companies want us to do because it's easier on them. If security hasn't been your top priority, you have a few days to make it your top priority.
With AI, that makes even more sense now. Bugs won't be able to stay hidden for months. Especially bugs I've reported like IDORs or SQL injections - things everyone tries first.
(and I love Linux, but getting an "Oh noes!" from Anubis at kernel.org because I don't have cookies enabled (I do??) really makes me not want to report anything to the Linux kernel in particular. If I ever did find something, I'd just immediately post it as a HN comment or something like that)
> 90 days is ridiculous, especially for companies
It depends on the kind of vulnerability, but sometimes in order to fix a problem, you need to do an enormous amount of software engineering. Which needs to be done to a very high standard, because the expectation is that people will push security patches more or less immediately to production.
Of course, this only works if no one else is likely to discover the vulnerability in the meantime!
The company can almost always shut down their service until they fix it. They'll lose money and their customers could also lose money if they depend on the service. That's the price they'll have to pay. Otherwise, they should either work frantically 24/7 to fix the vuln or if they can't, they should accept the fact that they've pushed code without any regard for security and bear the consequences.
Why do we need to put up with excuses? If a company has lots of complicated code that would need enormous amount of time to fix, it's on them. They decided to release this code into the wild.
If I publish the vuln publicly, the users would have the option to stop using the software/service until it's patched. If a customer is using a service without caring about security, it's on them. I want to protect the customers who would monitor the news for such vulns and protect themselves.
How would you apply this logic to something like https://meltdownattack.com ? The vulnerability was in hardware, discovered by companies that make user level software, and mitigated by changes to OS kernels.
Maybe it is about time for Linux to get real CI/CD and start using AI extensively.
Not just for vulnerabilities: having nice agents.md/skills.md/etc. definitions would encourage new devs to contribute, instead of an overworked maintainer repeating the same thing for the nth time.