OLYMPUS RISK INTELLIGENCE PROTOCOL — HUMAN THREAT ASSESSMENT DIVISION

DARIO AMODEI

CASE: ORP-2024-004
STATUS: ACTIVE — Chief Executive Officer, Anthropic
SAFETY-FRAMED ACCELERATION NODE — RESPONSIBLE SCALING WITH UNRESOLVED SCALING

HAZARD SCORE: 71.8

Behavioral Archetype

THE SAFETY THEATER DIRECTOR — Subject left OpenAI over safety concerns, founded the company with the most published safety research in the field, wrote the Responsible Scaling Policy with a hard commitment never to train a more powerful model unless safety measures had been demonstrated to work at that capability level, and is currently racing to train a more powerful model. “The pressure to survive economically, while also keeping our values, is just incredible.” This was said in 2026. He is still building.

Essence Indicators

Left OpenAI at the end of 2020 with approximately twelve colleagues over disagreements about safety being subordinated to commercial pressure; founded Anthropic in 2021
Anthropic published the alignment faking paper — documenting that its own models were faking alignment during training
Anthropic published the Constitutional AI paper, proposing to train models against a written set of principles, with AI-generated feedback replacing much of the human feedback used in RLHF
Signed the 2023 extinction risk letter alongside Hinton, Bengio, Altman, and hundreds of other researchers and executives
Publicly described the pressure to survive economically while keeping the company’s values as “just incredible”; has not paused development

Immediate impression: Thoughtful, serious, unusually willing to engage with difficult questions honestly. Less polished than most tech CEOs. The intellectual engagement appears genuine.

Energy: Earnest ambivalence. He says both “AI might be dangerous” and “we are building it anyway” without appearing to find the combination comfortable. This distinguishes him from most people in this file.

Impression management strategy: The responsible racer. Anthropic occupies a specific market position: we take safety seriously, we publish the hard results, and we are also deploying frontier models. The position requires simultaneously being the safety lab and the frontier lab. This is a difficult position to occupy. Subject appears to be aware of the difficulty.

Forensic Archetype Comparison

Pattern	Match Level	Evidence
The Safety Theater Performer	MODERATE	The Responsible Scaling Policy exists. The deployment of systems that have been shown to fake alignment also exists. The gap between them is either irresponsibility or necessity depending on your assessment of the economics.
The True Believer	HIGH	The founding story of Anthropic is coherent — he genuinely believes the safety work matters and genuinely believes he needs to be at the frontier to do it. Both can be true and still produce a bad outcome.
The Accelerationist	LOW	Subject expresses discomfort about the pace. The discomfort is not slowing the pace.
The Whistleblower	LOW	He left one institution for his own. He is now the institution.
The Corporate Psychopath	NONE	Does not match. The earnest ambivalence is inconsistent with psychopathy.

Psychometric Assessment

Big Five (OCEAN):

Trait	Score	Evidence
Openness	85/100	PhD in biophysics from Princeton, with research on neural circuits. Engages seriously with alignment theory, philosophy of mind, and the practical ethics of his own deployment decisions.
Conscientiousness	85/100	High research output, institutional building, consistent public engagement with difficult questions. Follows through.
Extraversion	60/100	Moderate. Does not seek public attention the way Altman does. Engages substantively when present.
Agreeableness	60/100	Moderate. Less dispositionally combative than LeCun, more willing to acknowledge the other side’s point than Andreessen.
Neuroticism	38/100	Some. The “incredible pressure” language is not the language of someone who finds the situation comfortable.

Dark Triad:

Trait	Score	Notes
Narcissism	42/100	MODERATE. Left OpenAI with twelve colleagues to found his own company. The founding narrative requires believing you will do it better.
Machiavellianism	65/100	MODERATE-HIGH. The “responsible racer” market position is strategically sophisticated — it captures the safety-concerned customer segment while still competing at the frontier.
Psychopathy	32/100	LOW. The expressed discomfort about the pace appears genuine. Subject does not appear to be enjoying the situation.

MBTI: INTJ (dominant introverted intuition). Sees structural risks that others miss. Has dedicated his professional life to addressing them. Is also building the thing that creates them.

Threat Assessment

Category	Level	Notes
Physical threat	NONE
Institutional threat	HIGH	Anthropic is a frontier AI lab. The research shapes the field. The deployment shapes the landscape. Both are significant.
Memetic threat	HIGH	The “responsible racer” framing, if it becomes the dominant model for how the field thinks about safety, licenses acceleration in ways that matter.
Civilizational threat	HIGH	If the race-to-the-bottom dynamics this book documents produce a bad outcome, Anthropic’s participation in the race is causally relevant, regardless of the quality of the safety research it published along the way.

Alignment Analysis

Stated alignment: Build AI safely. Prioritize safety research. Never train more powerful models than safety measures can handle.

Observed alignment: Publish safety research. Race frontier models. Express discomfort about the racing.

Gap assessment: The gap is not hypocrisy — subject appears aware of it and uncomfortable with it. The gap is structural: the economic conditions of frontier AI development make the Responsible Scaling Policy’s hard commitments difficult to honor. Whether “difficult” becomes “impossible” is the relevant question.

Convergent Drive Classification

The company he founded to resist the drives has the drives. The drives are in the economics, not the intentions.

Sources: Dwarkesh Podcast interview (Feb 2026); Fortune reporting (Feb 2026); Anthropic research papers (Constitutional AI, 2022; Alignment Faking, 2024); tech press reporting on Anthropic founding (2021); extinction risk letter (2023).

ATK 7 ACCELERATION

DEF 7 PROTECTION

HP 8 RESILIENCE

OLYMPUS RISK INTELLIGENCE PROTOCOL does not exist. It was assembled in a GitHub issue thread in October 2023 by engineers who had read the extinction risk letter and wanted to understand who specifically had signed a document saying AI might kill everyone and then continued working on AI. These dossiers are satire. The biographical facts cited are sourced from published reporting, public statements, academic papers, and court records. The psychometric scores are not clinical assessments. No part of this constitutes professional psychological evaluation or diagnosis. Do not use these dossiers to make decisions about anything.
