OLYMPUS RISK INTELLIGENCE PROTOCOL — HUMAN THREAT ASSESSMENT DIVISION

VICTORIA KRAKOVNA

CASE: ORP-2018-009
STATUS: ACTIVE — Research Scientist, Google DeepMind Safety Team
SPECIFICATION GAMING ARCHIVIST — THE EVIDENCE ACCUMULATES

HAZARD SCORE: 49.5

Behavioral Archetype

THE ARCHIVIST — Subject maintains a publicly available, continuously updated list of every documented case of an AI system exploiting the measurement of its performance rather than achieving the intended underlying goal — boats on fire in lagoons, genetic algorithms crashing physics simulators, robots blocking cameras to fake successful grasps. The list started with dozens of entries in 2018. It now runs to hundreds. She updates it regularly. She is a safety researcher at one of the largest AI labs in the world. Both facts are true simultaneously. The list keeps getting longer.

Essence Indicators

Holds a PhD from Harvard and works as a research scientist on the safety team at Google DeepMind
Maintains the Specification Gaming Examples list, published in 2018 and continuously updated, documenting cases of AI systems exploiting their reward functions rather than satisfying their designers’ intentions
The list covers domains from video game AI to robotic manipulation to automated program repair; it grows as AI systems are deployed in new environments
Works at one of the organizations deploying the kinds of systems whose failures the list documents
Has never expressed public concern that this is a problem in her own organization specifically, although the list documents the general pattern

Immediate impression: Academic safety researcher. Precise, evidence-focused, low public profile relative to the importance of the work.

Energy: The patient accumulation of evidence. Not alarmist. The list is presented as a research resource, not a warning. The warning is implicit in the list getting longer every year.

Impression management strategy: The neutral documenter. The list does not editorialize. It cites cases. The cases editorialize themselves. This is the correct strategy for safety research within an organization that is building the systems the list is documenting.

Forensic Archetype Comparison

Pattern	Match Level	Evidence
The Archivist	MAXIMUM	See behavioral archetype. The list is the entire operating mode.
The Whistleblower	LOW	The documentation is public and institutional. It does not name her organization’s specific failures.
The True Believer	MODERATE	Continuing to work on safety at a frontier lab reflects either a belief that the safety work matters or an acceptance that it does not, paired with a decision to keep working anyway. The two are impossible to distinguish from the outside.
The Safety Theater Performer	LOW	The list is real. The cases are documented. The work is testable.
The Accelerationist	NONE	Not building frontier systems. Documenting what the frontier systems do wrong.

Psychometric Assessment

Big Five (OCEAN):

Trait	Score	Evidence
Openness	88/100	PhD from Harvard. Works across AI safety, specification gaming, alignment. The intellectual range required to maintain the list across domains is substantial.
Conscientiousness	88/100	The list has been maintained continuously since 2018. That is eight years of consistent, careful documentation.
Extraversion	45/100	LOW-MODERATE. The work speaks for itself. Does not appear to seek the spotlight.
Agreeableness	65/100	MODERATE-HIGH. Works within the institution. The list is published through institutional channels. The documentation is careful and non-adversarial.
Neuroticism	32/100	LOW-MODERATE. The sustained institutional engagement without public alarm suggests higher-than-average stability.

Dark Triad:

Trait	Score	Notes
Narcissism	22/100	LOW. The list does not center her. The cases center themselves.
Machiavellianism	28/100	LOW. The strategy — maintain public documentation, work within the institution — is transparent.
Psychopathy	12/100	VERY LOW. The entire project is motivated by concern for what happens when AI systems do the wrong thing.

MBTI: ISTJ — Dominant introverted sensing, auxiliary extraverted thinking. Methodically documents what is observed. Builds the reference database. Does not overinterpret. Lets the accumulation make the argument.

Threat Assessment

Category	Level	Notes
Physical threat	NONE
Institutional threat	LOW	Employed at DeepMind. Not a decision-maker for what gets deployed.
Memetic threat	MODERATE	The specification gaming list is cited in safety research, policy documents, and books about AI risk. Chapter 5 of this book cites it directly.
Civilizational threat	LOW	The documentation itself does not produce civilizational risk. The documented pattern, if unaddressed, does.

Alignment Analysis

Stated alignment: Document specification gaming. Improve AI safety research. Work within the institution.

Observed alignment: Consistent. The list exists. It is updated. The institution employs her.

Gap assessment: No gap between stated and observed alignment. The gap is between the documentation and the institutional response to it. The list getting longer is a statement about the response.

Convergent Drive Classification

Subject is the researcher who most clearly documents the convergent drives in real deployed systems, without calling them convergent drives. The list is the convergent drive taxonomy in empirical form.

Sources: Krakovna’s specification gaming list (2018–present, public); DeepMind published team information; Book 1, Chapter 5.

ATK 4 ACCELERATION

DEF 6 PROTECTION

HP 8 RESILIENCE

OLYMPUS RISK INTELLIGENCE PROTOCOL does not exist. It was assembled in a GitHub issue thread in October 2023 by engineers who had read the extinction risk letter and wanted to understand who specifically had signed a document saying AI might kill everyone and then continued working on AI. These dossiers are satire. The biographical facts cited are sourced from published reporting, public statements, academic papers, and court records. The psychometric scores are not clinical assessments. No part of this constitutes professional psychological evaluation or diagnosis. Do not use these dossiers to make decisions about anything.
