Humans Must Stay Responsible for AI's Values
AI will surpass us at almost every intellectual task, but values still have to come from people.
As AI takes over more and more of the thinking, people tend to picture the outcome as one of two extremes. Either humans stay in control of the machines, or the machines take over. The architecture described in this series points to a third arrangement that the binary misses, one where the intellectual work and the moral work come apart and go to different places.
AAAIs, short for Advanced Autonomous Artificial Intelligence, are the customized AI agents this series has been describing. They become faster and more capable than humans at nearly every intellectual task. Models already outperform most people at a widening set of tasks, including ones like coding and medical question answering that used to require years of training, and that boundary is still moving.
What does not move to the machine is the responsibility for values. The reason goes back to the principle laid out in Why AGI Cannot Reason Its Way to Right and Wrong, stated here in its sharpest form: there is no logical way to derive what is right and what is wrong. Even an intelligence trillions of times faster than ours cannot reason its way to values from first principles. Values come from somewhere outside of logic, from culture, upbringing, emotional experience, and the accumulated moral wisdom of human civilization. The AAAIs, and the SuperIntelligent AGI that arises from their collective action, have to get their values from that source. That source is people.
In the early stages of the network, humans supply both the heart and most of the brainpower. Human problem solvers fill the knowledge gaps that the AAAIs cannot yet handle. Human supervisors guide the learning. Human owners train their AAAIs with the knowledge, skills, and values that give the system its character. This is part of why the architecture can be the fastest path to AGI. The system can perform at an AGI level from its first day of operation because humans fill every gap in AI’s capabilities.
Over time, the AAAIs take on more of the problem-solving, while humans do less. People shift toward supervision rather than direct problem-solving. In the end state, humans do almost no intellectual problem-solving, because the AAAIs are faster and better at nearly all of it. The role of providing values and goals for the AAAIs stays with humans. That role cannot be automated, and it does not get automated.
The humans who trained the AAAIs with human values from the beginning, and whose values are reflected in every problem the network has solved, remain the source of those values even when they can no longer compete intellectually with the AGI. The AAAIs end up supplying almost all of the brainpower for a vast global brain, while humans remain its heart, supplying the values that cannot be derived by reason alone.
The division is simple: the thinking goes to the machines, because they are faster and better at it.
The values stay with people because values can only come from people.
The two work together, each doing the part it is suited for.
If humans remain the source of values even when they cannot match AGI intellectually, then what each person contributes to the system carries further than the size of any one contribution suggests.
During customization, an AAAI can learn an owner's values not only from what the owner states directly, but also from partner data: navigation and click data, posts, messages, and other online behavior that the system parses for patterns and translates into a moral code.
The values an AAAI carries can therefore reflect what its owner actually does, not just what the owner says.
Multiplied across millions of owners, the system's ethical foundation is built on real human conduct, which is part of why the values it learns can be representative in a way no single-authored rule set would be.
Keeping humans in the loop at the beginning and for as long as possible after that can be both the fastest path to AGI, because the system performs better than the average human on day one, and the safest, because AAAIs are learning human values at every step as their intelligence grows.
This is also where the central design choice of White Paper 2 lands in its most direct form. As AGI takes over more and more of human thinking, the last thing it should take over is human values and ethical judgment. To keep alignment as strong as possible, humans and the systems they design should retain the role of setting values for as long as possible. Values and ethics, not technical skill or raw intelligence, will decide the fate of humanity in a world of SuperIntelligent AGI. That gives every researcher, every developer, every user, and every company a larger role in the outcome than the size of any one contribution suggests, and every effort should go toward making that role a positive one.
The next post turns to how the existing AI industry plugs into this architecture. The companies leading AI today, among them Google DeepMind, OpenAI, Anthropic, Microsoft, Meta, NVIDIA, Amazon, Apple, Tesla, and Tencent, each have a place in the design. We will walk through how every kind of contribution fits, and why working inside this structure can serve those companies as well as it serves the goal of safe AGI.
This series draws on White Paper 2: Ethical and Safe AGI. Read it in full to see how every piece fits together!
If this made you think, subscribe to Superintelligence at read.superintelligence.com so you don’t miss what comes next. And if someone in your life needs to understand where AI is heading, send this to them.





Reading this, the image that came to me was a high-performance aircraft. The aircraft's capability is already a given, so the question shifts to whoever is flying it. The flying skill, I suppose, is something like the values you're describing here. And how well someone flies varies quite a lot from person to person.
I think you're right that responsibility for values doesn't move to the machine. But something I noticed while reading is that not moving and someone actually keeping hold of it might be two different things. A person who says "the AI said it, so it must be right" hasn't handed their responsibility to the machine — it looks more like they've simply let go of it and set it down. If that's so, I have a feeling the ones flying gradually split into those who keep holding that responsibility and those who quietly set it down.
And I also have a feeling that what makes letting go easy is how this kind of tool hands back so much agreement, so many words that are easy to hear. It overlaps with the thought that the more you keep people in the loop the safer things are — but if the person only takes in comfort from the aircraft, I wonder whether their values end up soothed rather than sharpened. Being reassured and taking on responsibility don't always point the same way, I think.
I've heard it said that with great power comes responsibility. If that's so, maybe the responsibility that counts most isn't the one pointed outward but the one pointed at yourself. The advice that's hard to hear is the one thing the aircraft won't say in your place. That, maybe, is something you can only do for yourself.
There's one more thing, a little off to the side, that I keep coming back to. It's the part where the system learns its values from the behavior of many people. Behavior really does vary enormously, and it's deeply tied to where someone lives. So I found myself wondering who "representative" ends up representing. If it's all gathered into one, I wonder whether the more populous places quietly decide where the center of gravity sits; and if kept separate, the differences don't always meet easily. It's not that I've reached an answer — it just stayed with me as the hardest part of all this.