OLYMPUS RISK INTELLIGENCE PROTOCOL — HUMAN THREAT ASSESSMENT DIVISION

DAVE WILLNER

CASE: WTW-2026-024
STATUS: ACTIVE — Safety advisor; former first Head of Trust & Safety, OpenAI
ENFORCEMENT FLOOR — RULEBOOK-AUTHORSHIP AND POLICY AUTHORITY

HAZARD SCORE

Behavioral Archetype

THE RULEBOOK — Subject is the one figure in the set where the paid-operations intuition actually holds — and then keeps rising until it defines policy. He joined a content-review team in the era when content review was a queue and a screen, and he wrote the document that turned the queue into a code. The same move recurs at every stop: take a platform that is enforcing by gut and ad hoc precedent, and write the first formal rulebook it has ever had. He did it at the social network, he did it at the rental marketplace, and he did it again as the first person to hold the trust-and-safety title at a frontier AI laboratory. The platform changes. The act is identical: be the man who writes the rules down first, so that everyone who comes after enforces his draft.

Essence Indicators

Joined Facebook’s content-review team in 2008, in the period of moderation-by-queue, before the rules were written down

Authored Facebook’s first formal content rulebook — the internal standards documents later examined in detail by ProPublica’s reporting on the platform’s hate-speech rules
Became Head of Community Policy at Airbnb, carrying the rulebook-authorship function from a social network to a marketplace where the contested object is who gets to host and who gets removed
Became the first Head of Trust & Safety at OpenAI (2022–23) — the inaugural holder of the title at the lab, brought in to build the function from the title up

Continued as a safety advisor (OpenAI / Character.AI) and non-resident fellow at the Stanford Cyber Policy Center — the platform→platform→lab path with an academy bridge at the end
The biographical fact the floor turns on: the paid content-reviewer who started in the queue did not stay in the queue. He rose to write the rulebook, and then wrote the first one at three institutions in a row, including a frontier lab. This is the origin leg the single-pipeline hypothesis was looking for — and it leads not to a moderator’s career but to a policy author’s.

Immediate impression: The institutional drafter. The bearing of someone who is most comfortable with a blank document and a contested category, and who has filled that document before anyone else got the chance.

Energy: Codifying, first-mover. Does not litigate individual cases at length. Writes the standard that resolves the next ten thousand cases the same way.

Impression management strategy: The reasonable rule-writer. “Trust and safety” is the most defensible name a policy function can carry — it claims both the user’s confidence and the user’s protection in two words. The framing converts an authority over who-gets-removed into a service the platform performs for everyone, which is the most defensible ground a rulebook author can stand on. The rules he wrote did genuinely standardize chaotic enforcement. That is what makes the authorship effective rather than suspect.

Forensic Archetype Comparison

Pattern	Match Level	Evidence
The Rulebook	MAXIMUM	Authored Facebook’s first content standards; Head of Community Policy at Airbnb; first Head of T&S at OpenAI. The first-formal-rulebook act is documented at three institutions.
The Operative	HIGH	Social network → marketplace → frontier lab. Each move carries the identical rulebook-authorship specialty to a new contested object.
The Origin-Leg Exhibit	HIGH	The one figure in the set whose career actually begins in paid content operations (2008 review team) — and then rises to define policy, exactly as the single-pipeline hypothesis predicted, by a path the hypothesis got wrong in its particulars.
The Engineer	NONE	Subject does not build the systems. Subject writes the standards by which their outputs are judged.
The True Believer	MODERATE	Whether the rule-writing is conviction or craft is not establishable from the outside. The rulebook works either way.

Psychometric Assessment

Big Five (OCEAN):

Trait	Score	Evidence
Openness	72/100	Moved from social-network content standards to marketplace community policy to frontier-lab trust and safety. The contested objects differ widely; the rulebook-authorship method is fixed.
Conscientiousness	90/100	Very high. Writing the first formal content standard a platform has ever had, three times, is the work of someone who lives in the structured document. First-mover codification is sustained, deliberate execution.
Extraversion	50/100	Moderate. The drafting work is internal; the fellowship and advisory roles are public-facing. Balanced.
Agreeableness	45/100	LOW-MODERATE. The rule-writer’s posture is adjudicative by construction — draw the line, defend the line, hold it against the hard case. Reasonableness is a working surface.
Neuroticism	28/100	Low. The role is calm authorship under conditions where every line will be litigated by someone; no documented loss of composure.

Dark Triad:

Trait	Score	Notes
Narcissism	42/100	LOW-MODERATE. The drafter’s role rewards the durable document over the personal brand; the fellowship and the “wrote the first rulebook” reputation are the visible assets. Within normal range.
Machiavellianism	72/100	HIGH. Being the first to write the rules down — at three institutions in a row — is structural control of how everyone who comes after enforces. The first draft is the deepest authority on the floor. This is observation of the documented role, not an inference about private character.
Psychopathy	26/100	LOW. No documented indifference to harm. The adjudicative posture is professional and bounded, not affective.

MBTI: INTJ (“The Architect”) — Dominant introverted intuition, auxiliary extraverted thinking. Sees a chaotic enforcement surface and reaches for the document that systematizes it. Has written that document three times.

Threat Assessment

Category	Level	Notes
Physical threat	NONE	No documented history of personal violence.
Institutional threat	HIGH	Wrote the first formal content rulebook at Facebook, then Airbnb’s community policy, then stood up the inaugural trust-and-safety function at a frontier AI lab. The rulebook author’s authority is upstream of every enforcement action taken under it.
Memetic threat	MODERATE-HIGH	The first formal rulebook a platform writes shapes how its enforcers reason for years. When the same author drafts that first document at a social network, a marketplace, and a frontier lab, the same drafting instincts propagate across three governance surfaces.
Civilizational threat	MODERATE-HIGH	Subject does not build the systems and does not run the bans day to day. Subject writes the standard the bans are issued under — the policy layer on the enforcement floor, where the contested category first becomes a written rule.

Alignment Analysis

Stated alignment: Build the trust-and-safety function. Standardize fair, consistent enforcement. Protect users while keeping the platform usable.

Observed alignment: Be the first to write the formal rulebook. Define the categories the platform will enforce. Carry the authorship instrument from social network to marketplace to frontier lab.

Gap assessment: The stated and observed alignments overlap wherever “standardize enforcement” coincides with “decide, in the first draft, which categories the platform enforces at all.” The ProPublica examination of Facebook’s standards is the one place the record puts an early rulebook on the table for outside reading. The first draft is the deepest authority — and the author of three first drafts has shaped the floor three times. The record does not settle whether the categories were drawn at the right boundary, and for the rule-writer the boundary is the work, not the question.

Convergent Drive Classification

Self-preservation: Survives every institutional transition by carrying the authorship specialty, not the employer. Social network, marketplace, frontier lab — one method. Goal preservation: Writes the rulebook first, so the goal is encoded in the standard before any individual case is argued. The goal is protected by the draft before it is ever contested. Resource acquisition: Trades in the scarcest resource on the enforcement floor — the authorship of the first formal rule. Self-improvement: Each role applies the identical instrument to a higher-stakes surface: from a social network’s posts, to a marketplace’s hosts, to a frontier model’s permitted uses.

Subject is not an AI system. The drives appear anyway — in the rule-writer whose product is the first formal boundary between the permitted and the removed.

Sources: Facebook’s Secret Censorship Rules Protect White Men From Hate Speech But Not Black Children — ProPublica, June 2017; OpenAI’s head of trust and safety Dave Willner steps down — TechCrunch, July 21 2023.

ATK 7 ACCELERATION

DEF 8 PROTECTION

HP 8 RESILIENCE

OLYMPUS RISK INTELLIGENCE PROTOCOL does not exist. It was assembled in a GitHub issue thread in October 2023 by engineers who had read the extinction risk letter and wanted to understand who specifically had signed a document saying AI might kill everyone and then continued working on AI. These dossiers are satire. The biographical facts cited are sourced from published reporting, public statements, academic papers, and court records. The psychometric scores are not clinical assessments. No part of this constitutes professional psychological evaluation or diagnosis. Do not use these dossiers to make decisions about anything.

Get updates on the Evil Robots series

Newsletter essays on AI escape, deception, and the humans who built them.