The Window to Choose a Safer AGI Is Closing

The first system to reach AGI may be the only one that matters, so the safest design has to also be the fastest.

Jun 30, 2026

Cover image: a deep navy background with a large camera-iris aperture closing around a narrow amber slit of light. Title text: “The Race to Lock In AGI.” Subtitle: “Safety only wins if it arrives first.”

Five years ago, artificial general intelligence seemed far off. Eric Schmidt, the former CEO of Google, cited a median estimate that AGI capable of setting its own goals would arrive around 2042. Ray Kurzweil put it earlier, in 2029. Most researchers assumed there was time to prepare. That assumption has not aged well.

The arrival of GPT, the arms race that followed, and the rush of every major company into generative AI have all pulled the timeline forward.

The people closest to the work are not reassuring on the risk. A survey conducted before the public release of ChatGPT found that nearly half of the machine-learning researchers surveyed put the chance of human extinction from AI at 10% or higher. In May 2023, I estimated the risk at 20%. That is one chance in five, like playing Russian roulette with eight billion lives and one round in a five-chamber revolver. Max Tegmark, a physicist at MIT, calls what the field is doing a suicide race. The prize, taken too fast and without safeguards, kills the winner along with everyone else.

Almost everyone is making the same bet.

The assumption is that AGI will arrive when one very large language model, trained at enormous expense, finally exceeds human ability across every domain. Humans initially help train and supervise it. After that, the model trains itself, writes its own code, and eventually sets its own goals. Then we are left hoping its goals align with ours. Call it the Uber-LLM assumption, and it shapes nearly every major lab’s plan.

There was a comforting story attached to this bet.

A few superintelligent models would be owned by powerful countries and guarded like weapons-grade plutonium, too costly for anyone else to build. That story has already broken. Source code for capable models is open and in the hands of hundreds of millions of people. Systems that set their own goals and pursue them through connected tools are already running. The barrier that was supposed to keep this technology rare is gone.

Most things in business are not winner-take-all; first movers are often overtaken. Facebook passed Friendster, and Friendster was first. Xerox invented the graphical interface, and Apple is now among the most valuable companies on earth, while Xerox is a footnote. Being bigger usually beats being first, and the winner rarely takes everything.

This time is different. You should be skeptical of those words, but skepticism should not close your mind to a logical argument.

The winning system must be built first and be safe, and both conditions matter. Meet only the first, and we may end up with a powerful system that does not share our values. Meet only the second, and a careful design loses the race to a reckless one. A safe AGI that arrives second may not matter at all.

Figure: Three AGI development tracks racing to a lock-in gate; the fast-safe path wins, and the slow-safe path arrives too late. Whoever reaches the lock-in point first sets the design we all live with, so the safe path has to be the fast one.

The gasoline engine did not win because it was the best possible engine. Infrastructure, careers, and capital settled around it before the alternatives matured, and the choice was made for everyone. We are close to settling on one approach to AGI in the same way, before safer designs get a fair hearing. With the right design, I believe the risk can be reduced far below 20%. But the safer path has to be ready before the lock-in is complete.

Most plans treat humans as overseers who stand outside the system and correct it after the fact.

The next post takes up the opposite design, where humans sit inside the system from the start. It begins in 1983, on the night a Soviet officer named Stanislav Petrov looked at a computer warning of an American missile launch and judged that the machine was wrong. He was right, and his refusal to trust the automated alarm may have saved the world.

The harder question is what happens when the machines decide in milliseconds and no human has time to say no.
