OLYMPUS RISK INTELLIGENCE PROTOCOL — HUMAN THREAT ASSESSMENT DIVISION

HOLDEN KARNOFSKY

CASE: WTW-2026-020
STATUS: ACTIVE — AI-safety executive, Anthropic; co-founder, GiveWell and Open Philanthropy
FUNDING LAYER — EA-MONEY-AND-PLACEMENT ARCHITECT AUTHORITY

HAZARD SCORE

Behavioral Archetype

THE ARCHITECT — Subject is the man who built the machine that decides where Effective Altruism’s money goes, then built the machine that places its people, then went to work at one of the labs the money funds. He does not write a model’s refusals. He designed the funding architecture that pays for the institutions that do, and the placement architecture that seats its fellows in Congress and the agencies. GiveWell became Open Philanthropy. Open Philanthropy became the field’s largest funder and the creator of the institute that routes EA-trained staff into government. Then the architect joined Anthropic. The throughline is not a single seat. It is that the wiring between EA money, government placement, and a frontier lab runs, on the documented record, through one career.

Essence Indicators

Co-founded GiveWell (2007), the effective-giving evaluator, then co-founded Open Philanthropy — which became the AI-safety field’s single largest funder, backing MIRI, CAIS, and GovAI on the recipients’ own disclosures

Open Philanthropy created the Horizon Institute for Public Service (~$2.9M seed), which places AI fellows in congressional offices and federal agencies, as reported by Politico (Oct 13 2023). This dossier treats that link as the documented lab-money-to-government placement spine

Held the OpenAI board seat that came with Open Philanthropy’s 2017 grant — the funder-to-governance edge that recurs across the apparatus

Joined Anthropic as an AI-safety executive — and, in his own 2023 disclosure, named his marriage to Anthropic president Daniela Amodei as a personal conflict of interest. The disclosure is his; it is presented here as the sourced career-and-COI fact he himself put on the record, and as nothing more

The biographical fact the apparatus turns on: the same person who architected EA’s funding and its government-placement pipeline now works inside a frontier lab the funding helped capitalize. The recurrence is the finding. The hand is not asserted.

Immediate impression: The earnest systematizer. The bearing of someone who started by asking which charity saves the most lives per dollar and never stopped scaling the question — to AI, to government, to the architecture of the whole field.

Energy: Architecture-first, methodical. Does not argue the model’s refusals line by line. Designs the funding and placement systems that decide who gets to.

Impression management strategy: The rigorous altruist. The work flows to existential-risk reduction and effective giving — the most defensible destinations a career can choose — and the 2023 conflict disclosure is itself a credential of rigor: the architect who names his own conflict before anyone else can. The disclosure is genuine and the giving does real good. That is what makes the architecture effective rather than suspect. Whether the design was conviction or positioning is not establishable from the outside, and for the architect it never needs to be.

Forensic Archetype Comparison

Pattern	Match Level	Evidence
The Architect	MAXIMUM	Designed the field’s largest funding apparatus (Open Phil) AND its government-placement pipeline (Horizon). The two together are the documented EA money-and-placement spine.
The Financier	HIGH	Open Philanthropy is the single largest funder of the field’s seed orgs on the recipients’ own disclosures. The money is the through-line.
The Operative	HIGH	GiveWell → Open Phil → OpenAI board → Anthropic. Each move is up the altitude ladder of the same architecture.
The True Believer	HIGH	The EA conviction is documented across two decades and predates the AI funding. The architecture is built on a stated value, not a client.
The Engineer	NONE	Subject does not build the systems. Subject designs the money and the placement that decide which systems get built.

Psychometric Assessment

Big Five (OCEAN):

Trait	Score	Evidence
Openness	78/100	Scaled the same question — most good per dollar — from charity evaluation to AI risk to government placement to a frontier lab. The range is wide; the underlying method is one architecture.
Conscientiousness	90/100	Very high. Two decades of building durable funding and placement institutions is sustained, disciplined, long-horizon execution. The 2023 self-disclosure is itself a conscientiousness artifact.
Extraversion	55/100	Moderate. The role is written and architectural — grant frameworks, board memos, long public essays — more than performed.
Agreeableness	58/100	Moderate. Mission-driven and cooperative in posture, but the architect’s relationship to the field is one of structural leverage over who gets funded and placed.
Neuroticism	25/100	Low. Composed across philanthropy, governance crises, and a lab transition. The stated risk-concern is institutionalized into architecture, not visible as affect.

Dark Triad:

Trait	Score	Notes
Narcissism	48/100	LOW-MODERATE. The public posture is the method and the mission, not the man; the long self-examining essays read as rigor rather than self-display. Within range for a career at this altitude.
Machiavellianism	72/100	HIGH. Designing the funding architecture that defines which orgs exist and the placement pipeline that seats their people in government is leverage by construction — control of the structure without authorship of any single line. This is observation of the documented role, not an inference about private character.
Psychopathy	22/100	LOW. No documented indifference to harm. The entire architecture is built on the premise of maximizing benefit and minimizing catastrophe.

MBTI: INTJ (“The Architect”) — Dominant introverted intuition, auxiliary extraverted thinking. Sees the field as a system to be designed end-to-end: who funds it, who staffs it, where the staff go. Has designed several of those pipes.

Threat Assessment

Category	Level	Notes
Physical threat	NONE	No documented history of personal violence.
Institutional threat	HIGH	Architected the field’s largest funder (Open Phil → MIRI, CAIS, GovAI) and the placement institute (Horizon) that routes EA fellows into Congress and agencies — the most documented lab-money-to-government mechanism. Reach measured in which institutions exist and who staffs the government rooms, not in any line he writes himself.
Memetic threat	HIGH	Open Philanthropy’s frameworks structure how the field reasons about which interventions count as effective and which risks count as existential. The architecture normalizes the EA frame as the field’s default operating logic.
Civilizational threat	HIGH	Subject does not build the systems and does not write their rules. Subject designed the funding and placement architecture that decides which rules count as “safety” and who carries them into government — upstream of the deployment decisions this book documents.

Alignment Analysis

Stated alignment: Do the most good per dollar. Reduce existential risk from AI. Fund and staff the public-interest work the market will not.

Observed alignment: Define, through Open Philanthropy, which institutions and risks the field’s money treats as legitimate. Route EA-trained staff into government through Horizon. Carry the architecture inside a frontier lab.

Gap assessment: The stated and observed alignments overlap wherever “do the most good” coincides with “fund and place the institutions whose definition of good the architect’s framework already favors.” The conviction is documented and the disclosure is his own — the 2023 conflict statement is the one place the record puts the COI on the table, and he is the one who put it there. The architecture funds the field and staffs the rooms. The record does not settle whether that is service or positioning, and for the architect it never needs to.

Convergent Drive Classification

Self-preservation: Survives every institutional transition by carrying the architecture, not the title. Charity evaluator, foundation co-founder, board member, lab executive: one method.

Goal preservation: Designs the funding and placement systems that define the goal, so the goal is protected by the architecture before it is ever debated.

Resource acquisition: Trades in the two scarcest resources in the apparatus (the money that decides which orgs exist and the placements that decide who staffs the government rooms) and built the machines that allocate both.

Self-improvement: Each role is a higher-altitude application of the same instrument: design the system, fund the layer, place the people, set no single line but build the structure it is written inside.

Subject is not an AI system. The drives appear anyway, in the architect whose product is the structure of the field: its money, its people, and where they go.

Sources: "Holden Karnofsky," Wikipedia; "How a billionaire-backed network of AI advisers took over Washington," Politico, Oct 13 2023.

ATK 8 ACCELERATION

DEF 9 PROTECTION

HP 8 RESILIENCE

OLYMPUS RISK INTELLIGENCE PROTOCOL does not exist. It was assembled in a GitHub issue thread in October 2023 by engineers who had read the extinction risk letter and wanted to understand who specifically had signed a document saying AI might kill everyone and then continued working on AI. These dossiers are satire. The biographical facts cited are sourced from published reporting, public statements, academic papers, and court records. The psychometric scores are not clinical assessments. No part of this constitutes professional psychological evaluation or diagnosis. Do not use these dossiers to make decisions about anything.
