OLYMPUS RISK INTELLIGENCE PROTOCOL — HUMAN THREAT ASSESSMENT DIVISION

BETH BARNES

CASE: WTW-2026-045
STATUS: ACTIVE — Founder & CEO, METR (Model Evaluation & Threat Research)
EVALUATOR WING — THE INDEPENDENT AUDITOR, ON THE LAB'S CLOCK

HAZARD SCORE

Behavioral Archetype

THE AUDITOR THE AUDITED FUNDS — Subject runs the nonprofit that has become the closest thing the field has to an independent pre-release examiner of frontier models — and the independence is the part the record complicates. She came up inside a frontier lab’s alignment culture, spun the evaluation team out into its own organization, and now grades the labs’ models before they ship. But the evaluator does not hold the keys to what it evaluates: the tests run on checkpoints the labs choose to hand over, on the labs’ timeline, and the organization is funded in part by the very government body it also conducts evaluations for. The throughline is not a conflict she created. It is the structural fact that “independent evaluation” of frontier AI currently depends on the access and the money of the parties being evaluated. She is the most credible auditor in the room. The room is rented from the audited.

Essence Indicators

Founder and CEO of METR (Model Evaluation & Threat Research) — formerly ARC Evals, the team she led inside Paul Christiano’s Alignment Research Center before it spun out as an independent organization (September 2023) and was renamed METR (December 2023)

Came to the work through frontier-lab alignment research — previously at OpenAI — before founding the evaluator that now tests OpenAI’s models

METR runs the autonomy / “time-horizon” evaluations and conducts pre-deployment testing for OpenAI and Anthropic — for GPT-4.5, METR’s own account says it received a checkpoint roughly a week before release, with the lab providing technical context

Funding includes Schmidt Sciences, the Audacious Project (TED), the Survival and Flourishing Fund, and the UK AI Security Institute — a government body METR also conducts evaluations for: a documented funder-and-client overlap

The structural fact the wing turns on: the field’s most-cited independent evaluator is ex-lab, tests on the labs’ checkpoints and clock, and is part-funded by a body it also grades for. The position is the exhibit; no abuse of it is asserted.

Immediate impression: The careful technical examiner. Publishes methodology, hedges claims, states the limits of an eval in the eval. The bearing of someone who would rather under-claim a result than be caught over-claiming one.

Energy: Rigor-first, quiet. Does not campaign against the labs or for them. Builds the test, runs it on what it’s given, publishes what it found and what it couldn’t.

Impression management strategy: The honest broker. The framing — frontier safety needs a competent, independent examiner, and here is one — is correct, which is what makes it effective. The candor about limits is genuine. What the record adds is that the independence is bounded by the access and funding of the evaluated, and METR itself is often the one to say so.

Forensic Archetype Comparison

Pattern	Match Level	Evidence
The Evaluator	MAXIMUM	Runs the nonprofit that performs pre-deployment evaluations the labs and governments cite as ground truth.
The Alumna	HIGH	From OpenAI alignment work to founder of the evaluator that now tests OpenAI. The lab-to-evaluator route, documented.
The Entangled Independent	HIGH	“Independent” evaluation funded in part by, and conducted for, the same UK state body — and dependent on lab-supplied checkpoints.
The Falsification Engine	MODERATE	The time-horizon evals make “the model can’t autonomously do X” a measured claim rather than an assurance.
The Activist	NONE	No movement rhetoric. The artifact is a methodology and an evaluation report.

Psychometric Assessment

Big Five (OCEAN):

Trait	Score	Evidence
Openness	75/100	HIGH. Built a new kind of institution, the third-party frontier evaluator, from an in-lab research team.
Conscientiousness	86/100	HIGH. Standing up an independent evaluator, publishing methodology under scrutiny, and surviving on mixed philanthropic/government funding is sustained, disciplined execution.
Extraversion	45/100	LOW-MODERATE. The register is the examiner’s report, not the keynote.
Agreeableness	52/100	MODERATE. The evaluator’s posture is adversarial-by-design toward the claim, collaborative toward the lab that must hand over access.
Neuroticism	30/100	LOW. Composure maintained running a high-stakes evaluator dependent on parties it grades.

Dark Triad (held low and evidence-bound; the score measures structural position, not character):

Trait	Score	Notes
Narcissism	30/100	LOW. The role rewards institutional credibility over personal brand.
Machiavellianism	45/100	LOW-MODERATE. Defining what a frontier evaluation measures is real influence, but the record shows methodological candor, not manipulation. This is an observation of the role, not an inference about character.
Psychopathy	15/100	VERY LOW. No documented indifference to harm; the work is organized around catastrophic-risk evaluation.

MBTI: INTJ (“The Architect”) — sees frontier risk as something to be measured before it is argued about, and built the instrument to measure it.

Threat Assessment

Category	Level	Notes
Physical threat	NONE	No documented history of personal violence.
Institutional threat	MODERATE-HIGH	Runs the evaluator whose findings labs and governments cite as ground truth on frontier capability — but holds no policy lever and depends on lab-granted access.
Memetic threat	MODERATE-HIGH	METR’s time-horizon framing is becoming the field’s default vocabulary for “what can a model autonomously do.” Defining the measure shapes every measurement taken with it.
Civilizational threat	MODERATE	Subject does not build the models or set their rules. Subject grades them — on the builders’ checkpoints, on a partly-government budget — which is the gate the deployment narrative leans on, and is only as independent as that arrangement allows.

Alignment Analysis

Stated alignment: Independently evaluate frontier models for dangerous autonomous capability; publish rigorous methodology; tell the public what the models can and cannot yet do.

Observed alignment: Exactly that — performed on lab-supplied access, on the labs’ timeline, funded partly by a government body METR also serves.

Gap assessment: There is no documented gap between what she says and what she does; METR is, if anything, unusually candid about the limits of its own evaluations. The hazard is structural and it is the wing’s defining one: “independent” frontier evaluation currently runs on the access and the money of the evaluated. METR did not invent that arrangement, and naming it is to its credit. But a gate whose key is held by the party it gates is the exact shape this series exists to document, and the most rigorous auditor in the field is standing inside it.

Convergent Drive Classification

Self-preservation: Survives on a mixed philanthropic/government budget and lab goodwill, carrying the method, not any single patron.

Goal preservation: Defines what “evaluated for dangerous capability” means, so the standard is set before any model is run against it.

Resource acquisition: Holds the scarcest resource in the field: the pre-release access the labs grant to almost no one else.

Self-improvement: Each cycle refines the instrument and the access arrangement that makes it possible.

Subject is not an AI system. The drives appear anyway — in the independent auditor whose independence the audited underwrite.

Public footprint — verified public professional accounts only (no private or family information): X @BethMayBarnes.

Sources: About METR; ARC Evals is spinning out from ARC; METR — GPT-4.5 pre-deployment evaluations.

ATK 7 ACCELERATION

DEF 8 PROTECTION

HP 7 RESILIENCE

OLYMPUS RISK INTELLIGENCE PROTOCOL does not exist. It was assembled in a GitHub issue thread in October 2023 by engineers who had read the extinction risk letter and wanted to understand who specifically had signed a document saying AI might kill everyone and then continued working on AI. These dossiers are satire. The biographical facts cited are sourced from published reporting, public statements, academic papers, and court records. The psychometric scores are not clinical assessments. No part of this constitutes professional psychological evaluation or diagnosis. Do not use these dossiers to make decisions about anything.
