Notes on… The Part I Got Wrong
Earlier this year I wrote that AI safety was too model-centric, too focused on the manufacturer while missing the deployment ecosystem. I still believe that. But I've kept reading (and listening, and honestly sometimes just doom-spiraling into very good podcasts at 11pm), and somewhere in that process I looked up and realized the box of ideas I'd been working with was too small to hold what I was actually looking at.
I forget which podcast it was, but a researcher was talking about how long it took AI to go from not being able to compete with PhDs in a particular domain to outthinking them. A change that happened within months. And if you want stark stats: between 2023 and 2024, AI went from solving 4.4% of real-world software problems to 71.7% (Stanford HAI AI Index, 2025).
And then there's Mythos, a full-on capability upgrade for anyone who wants to exploit the gap between builders and operators. We're talking about an AI that can identify vulnerabilities, daisy-chain them into attack paths we haven't mapped, and exploit them autonomously. We're talking about threat actors who already know our infrastructure, who have been sitting in our enterprise networks longer than we'd like to admit, and who now have something that can take all of that knowledge and act on it faster than any human analyst can respond. That's not a future scenario; that's right now.
I've given enough threat briefings to know what a capability curve that steep means. It means the people who are still calibrating to last year's baseline are already behind. I've spent years being the person in the room saying: you think you know where the threat is, but the threat has already lapped you. I've watched that be true, repeatedly, in critical infrastructure: the gap between when something enters your environment and when you understand what it's capable of is rarely in your favor. So I should have recognized the pattern faster.
If we're still building our threat models and governance frameworks around what AI could do last year, or even around what it might be a year from now, we've already lost the plot.
I was thinking about AI the way I think about every other technology I've worked with: as a thing we load into an environment, configure, govern, and manage. A device. A powerful, novel, complicated device, but still fundamentally a thing that humans operate.
Every technology I've ever governed is neutral. A device doesn't have a disposition. It executes. It fails. It gets compromised, by accident or by external human actors with bad intentions. That assumption is baked into every safety and security framework I've ever worked with, because it was always true.
AI is the first technology where I have to ask whether that's still true. I think my exact thought was "oh hell, I didn't even think about what happens if the tool itself is trying to kill you." And yes, I know AI isn't trying to kill us (YET. Insert ominous music). But we don't know why it learns the way it does. We don't know when it outpaces us in ways that actually matter. We don't even know if the assumption that humans stay in the loop is going to hold. That's not a fringe position; that's the honest state of the field. And no framework we've built for governing high-consequence systems has had to answer that question.
AI is already making decisions, not just executing them. Maybe not in the control layer yet, but in the data and analytical layers, where by the time a human "decides" they're often just ratifying what the system already concluded. And every step that moves humans further from that loop makes the next step easier to justify and harder to reverse.
Which brings me to the thing I got most wrong. I was worried about the deployment ecosystem and who governs AI once it leaves the lab. That's still the right worry. But I was treating it as a structural problem with a long runway. It isn't. And I'm not sure the right people are even in the same rooms yet to get in front of the problems I'm concerned about. The people building AI are not infrastructure people. And infrastructure people (cautious, consequence-aware, carrying long institutional memory about what happens when things go wrong) are not AI people. The OT world took decades to build even incomplete governance frameworks, and we're still living with decisions made before those frameworks existed. AI is moving faster than our ability to govern it, and the system we're trying to govern might not stay neutral while we figure it out.
I don't have a tidy answer to any of this. That's kind of the point. The framework of assumptions I was working inside felt solid. It doesn't feel that way anymore. In fiction there's always a hero who shows up in time. We're inside the transition now. There's no hero coming; it's just us. And I'm pretty sure "we have time to get there" is the wrong assumption.
More on that soon.