
After AI Safety Certification: Why Model Safety Does Not Solve the Authority Problem

Anthropic’s proposal for FAA-style AI regulation addresses an important question: whether advanced AI models are safe enough to deploy. But safety certification does not solve a different governance problem that emerges once AI becomes part of organizational judgment. Who holds authority? When must decisions escalate to humans? Who remains accountable? And how can authority transitions be traced across AI-augmented workflows? This article argues that safe AI models and safe judgment systems are fundamentally different governance challenges. It introduces Decision Design, Decision Boundaries, and Decision Logs as components of a judgment architecture framework for structuring authority and accountability in AI-augmented organizations.

Executive Summary

In June 2026, Anthropic CEO Dario Amodei proposed an FAA-style regulatory regime for frontier AI: mandatory pre-deployment safety review, independent evaluation, government authority to block models that fail safety standards, and revenue-linked penalties for violations. No major AI lab has endorsed a more interventionist framework. It goes further than the positions associated with OpenAI.

The proposal is rational. It answers one question: is the model safe to release? It leaves a harder question untouched: once a certified model sits inside an organization's workflows, who exercises judgment?

Safe models and safe judgment systems are different problems. Aircraft certification determines whether a plane may fly. It does not determine who decides during the flight. The same gap appears wherever AI agents operate, including cybersecurity, procurement, and hiring. Human-in-the-Loop requirements, including those in Japan's AI Business Guidelines, are necessary but insufficient, because human presence is not the same as human authority. Closing the gap requires a separate discipline, Decision Design, which treats Authority Allocation, escalation, and Accountability Continuity as objects of institutional design.


The Question AI Safety Certification Leaves Open

On June 10, 2026, Dario Amodei published an essay arguing that governments should regulate AI the way they regulate aviation. The Nikkei and others reported the core proposal. Before a developer releases a frontier model, it should pass an independent safety review covering the risk domains that matter most: cybersecurity, biological weapons, loss of control, and automated research and development. When an evaluator finds unacceptable risk, the government should have authority to halt deployment. Companies that violate the regime would face penalties scaled to revenue.

The aviation framing is deliberate, and it works. We board aircraft without anxiety because a certification regime stands behind them. The proposal extends that confidence to AI: pass the review, and the model ships.

I am not disputing the proposal. Pre-deployment evaluation of frontier models answers a real rise in capability, as autonomous AI agents move from demonstration to deployment. My argument is narrower and more durable. Certification answers whether a model is safe to release. It does not answer who holds authority once the model is in use. That question does not disappear when a model passes review. It starts there.

Aviation shows where certification ends, if you take the analogy further than the proposal does.

The FAA Analogy: Certification Is Not Operational Judgment

A certificate of airworthiness establishes that an aircraft may fly. It says the design is sound and the airframe meets the standard. It does not specify who exercises judgment in the air, in the moment when a clearance is ambiguous or when the tower and the cockpit hold different pictures of the same runway.

The distinction has consequences. The U.S. Federal Aviation Administration's human-factors guidance states that human factors, not mechanical failure, underlie most aviation accidents and serious incidents. Analyses of Boeing's accident statistics point the same way. Airframe malfunctions figure in a minority of accidents. Investigators cite flight-crew factors in most fatal events involving large transport aircraft. The machines have grown more reliable. The judgment around them has not.

Take the collision at Tokyo's Haneda Airport on January 2, 2024. A Japan Airlines Airbus A350, landing with 379 people aboard, struck a Japan Coast Guard turboprop that had entered the runway. Five of the six crew on the smaller aircraft died. All 379 on the airliner escaped. Regulators had certified both aircraft, licensed both crews, and staffed the control room that night.

Japan's Transport Safety Board, in interim findings while the investigation continues, describes a convergence of breakdowns. The Coast Guard aircraft proceeded onto the runway believing it had clearance. Controllers did not register that the aircraft had entered and stopped on the runway. The landing crew did not see it until the final seconds. A single instruction sits near the center. Controllers told the Coast Guard crew it was "number one" for departure, meaning first in order. The crew heard permission to enter the active runway. The same words carried different meaning for the people who spoke them and the people who heard them.

One more detail should hold your attention if you deploy autonomous systems. A runway-incursion alert worked. The controller's display flagged the intrusion and kept showing it for about a minute before impact. It changed nothing. The controller was not watching the display, the alerts fired often enough that staff discounted them, and no rule said who should act when one appeared, or how.

The information existed, the safeguard worked, and a qualified human sat at the console. The system still lacked a designed answer to one question: when the signal fires, who decides what happens next? Certification does not reach that gap. AI is about to widen it.

Safe Models Versus Safe Judgment Systems

The aviation lesson transfers. A frontier model that passes independent review is a safe model. It is less likely to assist in catastrophic misuse and more robust against known failure modes. That is worth having, and it is not a safe judgment system. A certified model produces sound institutional outcomes only when the organization gets three things right: who acts on its outputs, when a human steps in, and where authority sits when the model and the operator disagree.

Model certification cannot settle those questions. Certification describes the artifact, not the organization that wields it. An enterprise can run only certified, state-of-the-art models and still have no answer to who decided what, because no one designed the decision structure around the model. A certified aircraft with no protocol for the moment the alert fires still loses its way in the air. A certified model inside an undesigned judgment system does the same.

Hold that distinction. Safe models and safe judgment systems are two design problems. AI Governance has invested in the first. The second stays implicit.

AI Agents and the Authority Question

The distinction matters most where AI agents already act inside real workflows. Three of those domains are in production now.

Cybersecurity: Who Decided to Stop the Business?

An AI security agent detects an anomaly, recommends isolation, and, where a team has configured it to act on its own, severs network connections and quarantines systems. From a security standpoint, speed is a virtue. The faster the containment, the less the spread.

Containment is also a business decision. Ransomware hit Asahi Group Holdings in late September 2025. The company detected the intrusion early one morning and, four hours later, cut its network and isolated its data centers to limit the damage. People made that call, and it halted production and took ordering and logistics offline at a company that holds about 40% of Japan's beer market. Shipping fell back to manual processes. Asahi did not restore normal delivery until February 2026.

Now place an autonomous agent in that four-hour window. It might have contained the intrusion in seconds and cut data exposure. It would also have decided, in those seconds, to halt a market-leading manufacturer at peak season. The security-optimal call and the business-optimal call are different, and choosing between them takes authority over the whole business rather than the security function alone.

Who decided to stop the business? The agent? The engineer who set its thresholds months earlier? No one? In most companies the answer is no one, and that is an authority problem rather than a model problem.

Procurement: Who Made the Purchasing Decision?

An AI agent compares vendors, recommends a supplier, and drafts the approval memo. A manager clicks approve. On paper a human decided. In practice the agent narrowed the field to one before the manager looked. When someone questions the contract later, who made the purchasing decision: the person who signed, or the system that built the option set? An approval that is only a formality carries accountability that is only a formality.

Hiring: Who Made the Hiring Decision?

An AI agent screens applications, scores candidates, and surfaces a shortlist. A hiring manager interviews from the top of the ranking and makes an offer. Did the manager exercise judgment or ratify a ranking? When a rejected candidate asks why, can a human answer, or does the company have only a model output no one can reconstruct?

In each case the technology performs well. The same thing stays unclear each time: who decided, and who answers for the decision. That question holds across domains, which is why it belongs to a governance layer of its own.

Human-in-the-Loop Is Not the Same as Human Authority

Regulators have noticed. Governments now require meaningful human involvement in autonomous AI systems, to keep human judgment in the loop for consequential actions and to guard against malfunction and privacy harm as AI agents gain autonomy. In Japan, the Ministry of Internal Affairs and Communications and the Ministry of Economy, Trade and Industry issued the AI Business Guidelines Version 1.2, which emphasize human oversight for autonomous AI systems. The direction is correct.

Yet Human-in-the-Loop has become an alibi: a human is present, so the decision must be governed. The examples above break that inference. A human clicked approve in procurement. A hiring manager sat across from the candidate. Licensed crews and a staffed control room stood behind Haneda. Presence did not prevent any of these breakdowns.

Human presence does not create authority. A human in the workflow is not the same as a human exercising judgment. Dropping a person, or an alert, into the loop says nothing about what they may decide, when control returns to them, or who answers for the outcome. Without those specifics, Human-in-the-Loop hides accountability instead of building it, by spreading a decision across so many actors that none of them owns it.

The questions that matter come next: where the human sits, what the organization authorizes them to decide, and how it documents that authority.

The Governance Gap: Where AI Governance Stops Short

Contemporary AI Governance does important work. It addresses safety, compliance, transparency, and oversight. It produces principles, risk taxonomies, model documentation, and audit trails for system behavior.

These frameworks leave Authority Allocation implicit. They describe the controls around the system. They rarely fix the structure of judgment inside it: which decisions the model makes in practice, which a human must own, when a decision must escalate, and how the organization records the handoff. So a familiar pattern recurs. A company with mature AI Governance on paper still cannot say, for a given consequential action, who held authority.

That blind spot is the Governance Gap, and beneath it sits the Authority Gap: the space where AI has taken over de facto decision-making while the allocation of authority and accountability stays undesigned. The Haneda controller and the approving manager both stand in that gap. So does an enterprise that runs certified models through an undesigned judgment system.

You do not close it with more oversight or stricter principles. You close it by designing the authority structure itself. That design has a name.

Decision Design: A Judgment Architecture for Authority Allocation

Decision Design treats judgment as an object of institutional design. It asks a question beyond decision quality: is the authority behind the decision legitimate, located, and accountable?

Three propositions define its core.

Decision Design is not about improving decisions alone; it is about designing the authority structure within which decisions become institutionally legitimate.

Decision Boundaries are not operational thresholds; they are institutional demarcations of legitimate authority.

Decision Logs do not merely record outputs; they preserve accountability continuity across distributed judgment processes.

Each proposition does specific work. The first separates Decision Design from decision quality. A faster or more accurate recommendation does not settle who may act on it. The second lifts a Decision Boundary above a tuning parameter, an amount or a confidence score, to mark where legitimate authority changes hands. Governance Decision Boundaries draw institutional lines, not operational ones. The third turns the Decision Log from a system output into the basis of Accountability Continuity, the means by which responsibility stays traceable as people and AI agents share the work of judgment.

What Decision Design Designs

What Decision Design Is Not

What Problem Decision Design Addresses

Implementing Governance Decision Boundaries

Three artifacts make Decision Design operational.

An Authority Allocation matrix maps each step of a workflow against decision rights: who may execute, who must approve, who can override, across both AI agents and human roles. Blank or duplicated cells are not clerical slips. They mark where authority sits unowned.
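One way to picture the matrix is as a lookup from workflow step and decision right to the actors who hold that right. The Python sketch below is illustrative only: the class `AuthorityMatrix`, the step names, and the role names are assumptions made for this example, not part of any published Decision Design specification. It shows how blank and duplicated cells can be surfaced mechanically once the matrix is written down.

```python
from dataclasses import dataclass, field

# Decision rights named in the text: who may execute, who must approve, who can override.
RIGHTS = ("execute", "approve", "override")


@dataclass
class AuthorityMatrix:
    """Maps (workflow step, decision right) to the actors holding that right."""
    cells: dict = field(default_factory=dict)

    def assign(self, step: str, right: str, actor: str) -> None:
        if right not in RIGHTS:
            raise ValueError(f"unknown decision right: {right}")
        self.cells.setdefault((step, right), []).append(actor)

    def gaps(self, steps):
        """List every blank or duplicated cell as (step, right, problem)."""
        findings = []
        for step in steps:
            for right in RIGHTS:
                holders = self.cells.get((step, right), [])
                if not holders:
                    findings.append((step, right, "blank"))
                elif len(holders) > 1:
                    findings.append((step, right, "duplicated"))
        return findings


# Hypothetical security workflow; every step and role name here is illustrative.
steps = ["detect_anomaly", "isolate_network"]
matrix = AuthorityMatrix()
matrix.assign("detect_anomaly", "execute", "security_agent")
matrix.assign("detect_anomaly", "approve", "soc_analyst")
matrix.assign("detect_anomaly", "override", "soc_lead")
matrix.assign("isolate_network", "execute", "security_agent")
matrix.assign("isolate_network", "override", "ciso")
matrix.assign("isolate_network", "override", "coo")

findings = matrix.gaps(steps)
for step, right, problem in findings:
    print(f"{step}/{right}: {problem}")
```

In this example the isolation step has no approver and two competing overriders. Those are the two cells the check reports, and they correspond to the question no one could answer in the cybersecurity scenario: who decided to stop the business.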

An escalation specification defines, ahead of time, the conditions that return a decision to a human: a monetary threshold, a scope of impact, a confidence floor below which the system must not proceed alone. It supplies what Haneda lacked, a rule for when the signal fires and who decides.
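The three conditions named above can be written down as data and checked before an agent acts. The following Python sketch assumes hypothetical names throughout (`EscalationSpec`, `ProposedAction`, `route`, and the threshold values); it is one possible shape for such a specification, not a reference implementation. The point it illustrates is that the rule names both the trigger and the role that decides.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class EscalationSpec:
    max_amount: float           # monetary threshold
    max_systems_affected: int   # scope of impact
    min_confidence: float       # confidence floor
    escalate_to: str            # human role that decides once any condition fires


@dataclass(frozen=True)
class ProposedAction:
    amount: float
    systems_affected: int
    confidence: float


def escalation_reasons(action: ProposedAction, spec: EscalationSpec) -> list[str]:
    """Return the names of every escalation condition the action triggers."""
    reasons = []
    if action.amount > spec.max_amount:
        reasons.append("monetary_threshold")
    if action.systems_affected > spec.max_systems_affected:
        reasons.append("scope_of_impact")
    if action.confidence < spec.min_confidence:
        reasons.append("confidence_floor")
    return reasons


def route(action: ProposedAction, spec: EscalationSpec) -> str:
    """Name who decides: the agent alone, or the human role in the spec."""
    return spec.escalate_to if escalation_reasons(action, spec) else "agent"


# Illustrative values only; real thresholds are an institutional choice.
spec = EscalationSpec(max_amount=50_000, max_systems_affected=10,
                      min_confidence=0.9, escalate_to="incident_commander")

quarantine_laptop = ProposedAction(amount=0, systems_affected=1, confidence=0.97)
isolate_data_center = ProposedAction(amount=0, systems_affected=400, confidence=0.97)

print(route(quarantine_laptop, spec))
print(route(isolate_data_center, spec))
```

Under these assumed thresholds the agent may quarantine a single laptop on its own, while isolating a data center returns to the incident commander on scope of impact, however confident the agent is.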

For consequential decisions, Decision Logs record who decided, on what basis, and by which path: whether an action began as an AI recommendation or a human determination, and which escalation condition, if any, fired. They serve Accountability Continuity, the ability to reconstruct after the fact who held authority. If you cannot reconstruct accountability, you do not have it.
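A Decision Log entry can be small. The Python sketch below is a minimal illustration under stated assumptions: the field names, the `DecisionLog` class, and the convention that the most recent entry for a decision identifies who held authority are all hypothetical choices made for this example. It shows the reconstruction the text asks for: who decided, on what basis, and by which path.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class DecisionLogEntry:
    decision_id: str
    actor: str                       # AI agent or human role acting at this step
    origin: str                      # "ai_recommendation" or "human_determination"
    basis: str                       # the grounds given for this step
    escalation_fired: Optional[str]  # escalation condition that fired, if any
    timestamp: str


class DecisionLog:
    """Append-only record of the steps behind each consequential decision."""

    def __init__(self) -> None:
        self._entries: list[DecisionLogEntry] = []

    def record(self, decision_id: str, actor: str, origin: str, basis: str,
               escalation_fired: Optional[str] = None) -> DecisionLogEntry:
        entry = DecisionLogEntry(decision_id, actor, origin, basis, escalation_fired,
                                 datetime.now(timezone.utc).isoformat())
        self._entries.append(entry)
        return entry

    def path(self, decision_id: str) -> list[DecisionLogEntry]:
        """Every recorded step for one decision, in the order it happened."""
        return [e for e in self._entries if e.decision_id == decision_id]

    def who_held_authority(self, decision_id: str) -> Optional[str]:
        """Assumption: the last recorded step is the one that settled the decision."""
        steps = self.path(decision_id)
        return steps[-1].actor if steps else None


# Hypothetical procurement decision; identifiers and wording are illustrative.
log = DecisionLog()
log.record("po-1042", "procurement_agent", "ai_recommendation",
           "lowest total cost among three vendors",
           escalation_fired="monetary_threshold")
log.record("po-1042", "procurement_manager", "human_determination",
           "accepted recommendation after reviewing the two rejected bids")

for entry in log.path("po-1042"):
    print(entry.actor, entry.origin, entry.escalation_fired)
print(log.who_held_authority("po-1042"))
```

A decision with no entries returns `None` from `who_held_authority`, which is the logged form of the answer "no one".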

Together these turn judgment from an assumption into a structure, an explicit Judgment Architecture instead of whatever emerges from whoever stands in the workflow. This is the layer of Institutional Governance beneath safety, compliance, and oversight. Most AI Governance programs have not built it yet.

Conclusion

The Anthropic proposal deserves the debate it has started. Pre-deployment review, independent evaluation, and revenue-scaled penalties are defensible instruments, and the work of making frontier models safer should continue. Japan's emphasis on human oversight points the right way.

None of it resolves the question that runs from the runway to the boardroom. Once a certified model shapes how an organization decides, who holds authority, and who answers for the outcome? Certification covers the model. It does not cover the institution that wields it. Solving the first problem leaves the second untouched.

FAA-style regulation may determine whether a model can be deployed.

It does not determine who holds authority once that model becomes part of organizational judgment.

That is a different design problem.

That problem is Decision Design.


Decision Design is a judgment architecture framework proposed by Ryoji Morii, founder of Insynergy Inc., for structuring authority, accountability, and decision boundaries in AI-augmented organizations.

Japanese version is available on note.
