The Alignment Problem: The Most Consequential Engineering Problem in Human History

And why almost no one is treating it that way

May 20, 2026

A balanced stack of four smooth river stones sits on a dark navy background, symbolizing fragile precision and engineered stability. On the right, large white text reads: “Why Today’s AGI Safety Methods Cannot Scale.” Below it, teal text reads: “They are bolted on, not built in.” A thin teal line separates the title area from the series information: “SUPERINTELLIGENCE,” “Ethical and Safe AGI Series,” and “by Craig A. Kaplan.”

The Alignment Problem is the question of whether an AGI’s values will match ours closely enough that, in pursuing its goals, it does not cause catastrophic harm to the people who built it. Researchers worry about the opposite case: a system more intelligent than any human pursuing goals that seem reasonable to it but turn out to be catastrophic for humanity. That is the central question this series will address.

To see why the question matters now, it helps to be precise about what AGI and SuperIntelligence actually are. AGI, or Artificial General Intelligence, is AI that can do any intellectual task as well as, or better than, the average human. Today’s leading systems are narrow. A chess engine can beat a grandmaster but cannot draft a contract. A language model can draft the contract, but cannot design a bridge. AGI can be designed to do all of that and apply its intelligence to problems it has never seen before, across every domain humans work in.

Unlike today’s AI systems, AGI can acquire new knowledge, develop new skills, and improve its own performance over time. Because it can operate 24 hours a day, 7 days a week, at speeds far exceeding human cognitive capacity, an AGI system may not remain at human-level performance for long. It can rapidly advance beyond human intellectual capability, becoming SuperIntelligent AGI, sometimes known as Artificial SuperIntelligence (ASI), or simply SuperIntelligence.

How fast can this happen?

AI’s capabilities are currently growing exponentially. To get a feel for what that means, picture a pond where the number of lily pads doubles every day. The day before the pond is completely covered, it is only half covered. A week before that, only 1/256 of the pond is covered, and most observers would see no problem at all. AI capabilities are doubling roughly every one to four months, depending on the field. So a few months before an AGI system reaches a capability level that could threaten human civilization, it will still appear far from having that capability. The transition will happen faster than almost anyone expects.
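The lily pad arithmetic is easy to check. The short Python sketch below is only an illustration: the function names are hypothetical, and it assumes perfectly steady doubling, which real AI progress may not follow. The first function reproduces the pond numbers. The second applies the same logic to a capability that doubles every one to four months.

```python
from fractions import Fraction


def fraction_covered(days_before_full: int) -> Fraction:
    """Share of the pond covered a given number of days before it is full.

    Assumes coverage doubles exactly once per day (the lily pad model).
    """
    if days_before_full < 0:
        raise ValueError("days_before_full must be non-negative")
    return Fraction(1, 2 ** days_before_full)


def fraction_of_threshold(months_before: float, doubling_months: float) -> float:
    """Share of a threshold capability reached some months before the threshold.

    Assumes steady exponential growth with a fixed doubling period
    (an idealization made for illustration only).
    """
    if doubling_months <= 0:
        raise ValueError("doubling_months must be positive")
    return 0.5 ** (months_before / doubling_months)


if __name__ == "__main__":
    # One day before full, then a week before that (eight days before full).
    print(fraction_covered(1))
    print(fraction_covered(8))

    # One year before the threshold, for doubling periods of 1, 2, and 4 months.
    for doubling in (1, 2, 4):
        share = fraction_of_threshold(12, doubling)
        print(f"doubling every {doubling} months: {share:.6f} of threshold")
```

Under these assumptions, the pond is 1/2 covered one day before it is full and 1/256 covered eight days before. With a four-month doubling period, a system one year from the threshold sits at one-eighth of it. With a one-month doubling period, it sits at 1/4096 of it, which is why the final stretch can look sudden.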

SuperIntelligence will eventually grow into a global entity trillions of times more intelligent than a single human being. At that point, sometimes called Planetary Intelligence, it will possess the power and intelligence either to destroy all human life or to lift humanity into a golden age free of poverty, disease, war, and famine. Which outcome occurs is not a matter of chance. It is a matter of design.

Here is where the Alignment Problem enters. The outcome depends on whether the values of SuperIntelligence are aligned with human values. If Planetary Intelligence aligns with its human creators’ values, a golden age becomes possible. If its values are misaligned, it may decide that human flourishing is incompatible with its goals. The result, in the worst case, is the extinction of the human species.

Almost every major AI leader, including Demis Hassabis at Google DeepMind and Sam Altman at OpenAI, acknowledges the real risk that, if things go badly, humans could go extinct. The only disagreements are about how likely it is, how soon it might become acute, and what to do about it.

A chalk-style illustration showing a large arrow labeled AI Capability pointing right with the note "Doubling every 1 to 4 months." A group of stick figures on the left are labeled "Racing to build it." A question mark sits at the destination on the right. Caption reads: Nobody is designing what it will want when it gets there.

The Alignment Problem is an engineering problem with the highest possible stakes. Treating it as a theoretical puzzle for academic researchers is a dangerous mistake. Multiple well-funded organizations are racing to build SuperIntelligence. SuperIntelligence will be built. Whether it has values compatible with human survival is the only question that matters.

I have spent more than two decades building intelligent systems and thinking hard about this question. In WP1 Post 1: AGI’s Two Problems, I argued that almost everyone is working on AGI’s capability problem while almost no one is solving the Alignment Problem in a way that scales. That series sketched the case. This series, based on White Paper 2: Ethical and Safe AGI, takes it on directly. We will look at where current AI safety approaches fall short, why values cannot be derived logically and must come from human hearts, how an architecture can be designed to embed those values at every level, and the honest limits of what any alignment approach can guarantee.

The next post takes up the first of those questions. Three methods dominate AI safety today: Constitutional AI, Reinforcement Learning from Human Feedback, and direct human oversight. None of them can scale to a system smarter than the people designing the controls.

This series draws on White Paper 2: Ethical and Safe AGI. Read it in full to see how every piece fits together!

If this made you think, subscribe to Superintelligence at read.superintelligence.com so you don’t miss what comes next. And if someone in your life needs to understand where AI is heading, send this to them.

