The Safest AGI Keeps Humans Inside It

Most plans bolt human values onto a finished machine. The safer design builds the machine around people from the start.

Jul 01, 2026

On September 26, 1983, a Soviet early-warning system reported that the United States had launched a nuclear missile, then reported four more behind it.

The duty officer that night was Lieutenant Colonel Stanislav Petrov. Protocol told him to send the warning up the chain, which could have triggered a retaliatory launch. He judged it a false alarm and held back. He was right.

The satellite system had malfunctioned, and a man who trusted his own reasoning over the machine is credited with preventing a nuclear war.

Petrov was not honored for it at the time. Recognizing him would have drawn attention to the fact that the Soviet warning system was defective. He received a pension and a quiet retirement. The lesson outlived the silence. A human inside the loop, weighing what the machine could not, kept a faulty automated system from ending millions of lives.

AGI is more dangerous than nuclear weapons, by a wide margin. If a human in the loop saved us from our most dangerous technology once, the design of any more dangerous technology should keep humans in the loop on purpose. Most AGI plans do the opposite.

Most researchers cannot picture how a human stays in the loop of a system that thinks far faster than any person can follow. They expect AGI to arrive as an Uber-LLM that trains itself, rewrites its own code, and reasons millions of times faster than a human mind. A person watching that system could never keep pace. So the human gets moved outside, recast as an overseer who reviews the output after the fact and hopes to catch problems before they spread. Most researchers see no alternative, but one exists.

The order of construction is backward. The typical plan is to build a powerful machine first, then add human values and safety afterward to head off the alignment problem. The better approach starts with people. Build a human collective intelligence that solves problems together, then bring AI into it piece by piece and train it on the values of the humans already inside.

Figure: Two plans drawn as concentric rings. Build the system first, and values are a brittle outer shell; start with people, and values are the core.

Run that forward, and the system changes character without ever losing the humans. Early on, people do most of the thinking. Over time, the AI does more, until the combined system is far more capable than any group of people. The humans never leave. Because the AI learned its values from them while they worked side by side, those values become part of the system. They are not applied as a final layer. Human ethics are built into its makeup from the first day.

This design has a second advantage that matters as much as safety. It can be built now, from people and the AI models that already exist. It does not wait on a future Uber-LLM that may or may not arrive. A system that can be built immediately can be first, and being first is what decides which design the world locks in.

A network of ordinary people, working together, can match or beat the most capable individuals at hard problems. The next post takes up the evidence, drawn from a company I built where the combined judgment of millions of everyday investors outperformed most professionals on Wall Street.
