The Most Consequential Engineering Problem in Human History
And why almost no one is treating it that way
Alignment means that an AI system’s values closely match ours, so that, in pursuing its goals, it does not cause catastrophic harm to the people who built it.
When researchers worry about alignment, they worry about a system more intelligent than any of us pursuing goals that seem reasonable to it but turn out to be catastrophic for humanity. That is the Alignment Problem, and it is the central question this series will address.
To see why the question matters now, it helps to be precise about what AGI and SuperIntelligence actually are. AGI, or Artificial General Intelligence, is AI that can do any intellectual task as well as, or better than, the average human. Today’s leading systems are narrow. A chess engine can beat a grandmaster but cannot draft a contract. A language model can draft the contract, but cannot design a bridge. AGI can be designed to do all of that and apply its intelligence to problems it has never seen before, across every domain humans work in.
Unlike today’s AI systems, AGI can acquire new knowledge, develop new skills, and improve its own performance over time. Because it can operate 24 hours a day, 7 days a week, at speeds far exceeding human cognitive capacity, an AGI system may not remain at human-level performance for long. It can rapidly advance beyond human intellectual capability, becoming SuperIntelligent AGI, sometimes known as Artificial SuperIntelligence (ASI) or simply SuperIntelligence.
How fast can this happen?
AI’s capabilities are currently growing exponentially. To get a feel for what that means, picture a pond where the number of lily pads doubles every day. The day before the pond is completely covered, it is only half covered. A week before that, eight days from full coverage, only 1/256 of the pond is covered, and most observers would see no problem at all. AI capabilities are doubling roughly every one to four months, depending on the field. A few months before an AGI system reaches a capability level that could threaten human civilization, it will appear far from having that capability. The transition will happen faster than almost anyone expects.
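For readers who want to check the arithmetic, here is a minimal sketch of the doubling math behind the lily pad picture. The function names are illustrative, and the model assumes clean, uninterrupted doubling, which is a simplification rather than a measurement of real AI progress.

```python
from fractions import Fraction


def pond_coverage(days_before_full: int) -> Fraction:
    """Fraction of the pond covered a given number of days before it is full,
    assuming coverage doubles every day."""
    return Fraction(1, 2 ** days_before_full)


def growth_factor(months: float, doubling_months: float) -> float:
    """How many times larger a quantity becomes after `months`,
    assuming it doubles once every `doubling_months`."""
    return 2 ** (months / doubling_months)


print(pond_coverage(1))       # one day before full: 1/2
print(pond_coverage(8))       # a week before that: 1/256
print(growth_factor(12, 4))   # one year at a four-month doubling time: 8.0
print(growth_factor(12, 1))   # one year at a one-month doubling time: 4096.0
```

Under these assumptions, a one-to-four-month doubling time works out to somewhere between an 8-fold and a 4,096-fold increase over a single year, which is why the early stages can look so unremarkable.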
SuperIntelligence will eventually grow into a global entity trillions of times more intelligent than a single human being. At that point, sometimes called Planetary Intelligence, it will possess the power and intelligence to either destroy all human life or lift humanity into a golden age free of poverty, disease, war, and famine. Which outcome occurs is not a matter of chance. It is a matter of design.
Here is where the Alignment Problem enters. The outcome depends on whether the values of SuperIntelligence are aligned with human values. If Planetary Intelligence aligns with its human creators’ values, a golden age becomes possible. If its values are misaligned, it may decide that human flourishing is incompatible with its goals. The result, in the worst case, is the extinction of the human species.
Almost every major AI leader, including Demis Hassabis at Google DeepMind and Sam Altman at OpenAI, acknowledges the real risk that, if things go badly, humans could go extinct. The only disagreements are about how likely it is, how soon it might become acute, and what to do about it.
The Alignment Problem is an engineering problem with the highest possible stakes. Treating it as a theoretical puzzle for academic researchers is a dangerous mistake. Multiple well-funded organizations are racing to build SuperIntelligence. SuperIntelligence will be built. Whether it has values compatible with human survival is the only question that matters.
I have spent more than two decades building intelligent systems and thinking hard about this question. In WP1 Post 1: AGI’s Two Problems, I argued that almost everyone is working on AGI’s capability problem while almost no one is solving the alignment problem in a way that scales. That series sketched the case. This series, based on White Paper 2: Ethical and Safe AGI, takes it on directly. We will look at where current alignment approaches fall short, why values cannot be derived logically and must come from human hearts, how an architecture can be designed to embed those values at every level, and the honest limits of what any alignment approach can guarantee.
The next post takes up the first of those questions: why the alignment approaches the major labs are pursuing today, including Constitutional AI, Reinforcement Learning from Human Feedback, sandboxing, and kill switches, cannot scale to a system smarter than the people designing the controls.
This series draws on White Paper 2: Ethical and Safe AGI. Read it in full to see how every piece fits together!
If this made you think, subscribe to Superintelligence at read.superintelligence.com so you don’t miss what comes next. And if someone in your life needs to understand where AI is heading, send this to them.





