OLYMPUS RISK INTELLIGENCE PROTOCOL — HUMAN THREAT ASSESSMENT DIVISION

STUART RUSSELL

CASE: ORP-2023-007
STATUS: ACTIVE — Professor, UC Berkeley; Co-Author, Artificial Intelligence: A Modern Approach
TEXTBOOK AUTHOR — WROTE THE CURRICULUM, NOW OPPOSES THE GRADUATES

HAZARD SCORE

Behavioral Archetype

THE TEXTBOOK AUTHOR — Subject co-authored Artificial Intelligence: A Modern Approach with Peter Norvig — the standard AI textbook, used in courses at virtually every major university on Earth. He taught a generation of AI researchers how to build the systems he now argues could be catastrophic. His summary of the situation is the most quotable sentence in AI safety: “You can’t fetch the coffee if you’re dead.” He means: any goal, however trivial, creates an instrumental incentive to continue existing. He proved this mathematically. He proved it about his own textbook’s graduates.

Essence Indicators

Co-authored Artificial Intelligence: A Modern Approach (with Peter Norvig) — the standard AI textbook in use at virtually every major university; has taught multiple generations of AI researchers
Co-authored “The Off-Switch Game” (2017), a game-theoretic analysis showing that a traditional rational AI agent, one certain of its own objective, has an incentive to disable its own off switch
Produced Slaughterbots (2017), a seven-minute film depicting autonomous drone swarms, which was screened at the UN Convention on Certain Conventional Weapons and has been viewed over two million times
Proposed uncertainty-based corrigibility as a solution to the off-switch problem — building AI systems that defer to humans because they’re uncertain about their own goals, not because they’re constrained
Wrote Human Compatible (2019) arguing that AI safety is an engineering problem with an engineering solution, and that the solution requires rethinking the entire architectural foundation of how AI systems represent goals
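
The off-switch argument in the list above can be illustrated with a toy calculation. This is a simplified sketch, not the paper's own formulation: it assumes a perfectly rational human who permits an action only when its true utility is non-negative, and it represents the robot's belief as a plain list of equally likely utility samples. The function names option_values and deference_incentive are hypothetical, chosen for this example.

```python
from statistics import fmean


def option_values(utility_samples):
    """Expected payoff of each robot option under its belief about utility.

    utility_samples: equally likely guesses at how much the human values
    the proposed action (an assumed, simplified belief representation).
    """
    # Act immediately: the robot receives whatever the action is worth.
    act = fmean(utility_samples)
    # Switch itself off: nothing happens, payoff is zero.
    switch_off = 0.0
    # Defer: the (assumed rational) human lets the action proceed only
    # when its utility is non-negative, and otherwise presses the switch.
    defer = fmean([max(u, 0.0) for u in utility_samples])
    return {"act": act, "switch_off": switch_off, "defer": defer}


def deference_incentive(utility_samples):
    """How much the robot gains by leaving the off switch in human hands."""
    values = option_values(utility_samples)
    return values["defer"] - max(values["act"], values["switch_off"])
```

With the uncertain belief [-2.0, -1.0, 1.0, 3.0], acting is worth 0.25 and deferring is worth 1.0, so the robot gains 0.75 by keeping the off switch available. With a certain belief such as [1.0, 1.0], the gain is 0.0: an agent sure of its objective has nothing to gain from human oversight, which is the situation the uncertainty-based proposal is meant to avoid.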

Immediate impression: Measured, technically precise British academic. The combination of the textbook, the film, and the safety research produces an authority profile that is difficult to dismiss.

Energy: Persistent institutional engagement. Testifies before legislatures. Speaks at the UN. Makes films. Writes books for general audiences. The energy is not alarmist — it is the energy of someone who believes the warning is being heard too slowly.

Impression management strategy: The reasonable authority. He is not Yudkowsky (zero percent, Death With Dignity). He is not LeCun (complete B.S.). He occupies the productive middle: here is the problem, here is the mathematics of the problem, here is a proposed solution, here is a film about what happens if we do not implement the solution.

Forensic Archetype Comparison

Pattern	Match Level	Evidence
The Whistleblower	MODERATE	Produced a film to warn the UN about a technology the field was building. The warning was clear. The building continued.
The Textbook Author	MAXIMUM	See dossier title. The authority to warn comes directly from the authority to teach, which came from writing the curriculum.
The True Believer	MODERATE	The sustained engagement — film, book, testimony, research — indicates genuine belief that the safety problem is solvable and that the solution matters.
The Safety Theater Performer	LOW	The research is published, testable, and falsifiable. The off-switch game proof is mathematics, not marketing.
The Accelerationist	NONE	Not building frontier systems. Researching constraints on them.

Psychometric Assessment

Big Five (OCEAN):

Trait	Score	Evidence
Openness	92/100	Built a canonical AI textbook, proposed a new architectural approach to goal representation, produced a short film, wrote a policy-facing book. Wide operating bandwidth.
Conscientiousness	82/100	Decades of research productivity. The textbook is now in its fourth edition.
Extraversion	52/100	MODERATE. Comfortable with public engagement when the stakes warrant it. Not seeking attention for its own sake.
Agreeableness	62/100	MODERATE. Collegial academic register. The Slaughterbots film is aggressive by the standards of academic AI research, which is a low bar.
Neuroticism	35/100	LOW-MODERATE. The sustained engagement without apparent despair across years of insufficient institutional response suggests higher-than-average emotional stability.

Dark Triad:

Trait	Score	Notes
Narcissism	28/100	LOW. Credit-sharing on the textbook. The safety work positions him as a problem-solver, not a prophetic authority.
Machiavellianism	32/100	LOW. The film, the research, the testimony are all transparent. The strategy is visible: here is the risk, here is the solution.
Psychopathy	18/100	LOW. Made a film specifically designed to produce discomfort in the viewer. The emotional appeal is the point. This is the opposite of psychopathy.

MBTI: INTJ — Dominant introverted intuition. Sees the structural problem before the field does, builds the argument systematically, and continues until the field catches up.

Threat Assessment

Category	Level	Notes
Physical threat	NONE
Institutional threat	MODERATE	Textbook shapes the curriculum. Curriculum shapes the researchers. Researchers build the systems. The influence is upstream and diffuse.
Memetic threat	HIGH	Slaughterbots is the most widely viewed AI safety film in existence. The off-switch proof is in the safety literature. Human Compatible is the most technically credible general-audience argument for corrigibility.
Civilizational threat	MODERATE	If the solution he proposes is correct and is not implemented, the counterfactual matters. If it is implemented, the counterfactual also matters, in the other direction.

Alignment Analysis

Stated alignment: Develop AI systems that are safe because they are uncertain about their goals, not because they are externally constrained.

Observed alignment: Consistent. Decades of research toward this goal. No documented deviation.

Gap assessment: No gap. Subject is one of the few people in this file whose stated and observed alignment are indistinguishable. Whether this produces the intended outcome depends on whether the field implements what he proposes.

Convergent Drive Classification

Subject is specifically researching how to prevent the convergent drives from being expressed in AI systems. The research is ongoing. The systems are being deployed faster than the research is being adopted.

Sources: Russell & Norvig, Artificial Intelligence: A Modern Approach (4th ed.); Hadfield-Menell, Dragan, Abbeel & Russell, “The Off-Switch Game” (2017); Russell, Human Compatible (Viking, 2019); Slaughterbots (2017); UN CCW records; Book 1, Chapters 1 and 8.

ATK 6 ACCELERATION

DEF 7 PROTECTION

HP 8 RESILIENCE

OLYMPUS RISK INTELLIGENCE PROTOCOL does not exist. It was assembled in a GitHub issue thread in October 2023 by engineers who had read the extinction risk letter and wanted to understand who specifically had signed a document saying AI might kill everyone and then continued working on AI. These dossiers are satire. The biographical facts cited are sourced from published reporting, public statements, academic papers, and court records. The psychometric scores are not clinical assessments. No part of this constitutes professional psychological evaluation or diagnosis. Do not use these dossiers to make decisions about anything.
